Abstract
We define an historical algebra for historical relations. This historical algebra, a
straightforward  extension  of  the  conventional  relational  algebra,  supports  valid  time,  the 
time  when  an  object  or  relationship  in  the  enterprise  being  modeled  is  valid.  Historical 
versions  of  the  five  relational  operators  union,  difference,  cartesian  product,  selection, 
and  projection  are  defined  and  a  new  operator,  historical  derivation,  is  introduced.  The 
algebra  includes  aggregates  and  is  shown  to  have  the  expressive  power  of  the  temporal 
query  language  TQuel.  The  algebra  is  consistent  with  the  user-oriented  model  of  historical 
relations  as  space-filling  objects  and  satisfies  all  but  one  of  the  associative,  commutative, 
and distributive tautologies involving union, difference, and cartesian product.

Time is a universal attribute of both events and objects in the real world. Events occur at
specific points in time; objects and the relationships among objects exist over time. The ability to
model this temporal dimension of the real world is essential to many computer system applications
(e.g., econometrics, banking, inventory control, medical records, and airline reservations).
Unfortunately, conventional database management systems do not support the time-varying aspects of the
real world. Conventional databases can be viewed as snapshot databases in that they represent the
state of the real world at one particular point in time. As a database is changed to reflect changes
in the real world, out-of-date information, representing past states of the real world, is deleted. The
need for database support for time-varying information has received increasing attention; in the
last five years, more than 80 articles relating time to information processing have been published
[McKenzie 1986].

In  previous  papers,  we  identified  three  orthogonal  kinds  of  time  that  a  database  management 
system (DBMS) needs to support: valid time, transaction time, and user-defined time [Snodgrass &
Ahn 1985, Snodgrass & Ahn 1986]. Valid time concerns modeling time-varying reality. The valid
time  of,  say,  an  event  is  the  clock  time  at  which  the  event  occurred  in  the  real  world,  independent 
of  the  recording  of  that  event  in  some  database.  Transaction  time,  on  the  other  hand,  concerns 
the  storage  of  information  in  the  database.  The  transaction  time  of  an  event  is  the  transaction 
number  (an  integer)  of  the  transaction  that  stored  the  information  about  the  event  in  the  database. 
User-defined  time  is  an  uninterpreted  domain  for  which  the  DBMS  supports  the  operations  of 
input,  output,  and  perhaps  comparison.  As  its  name  implies,  the  semantics  of  user-defined  time 
is  provided  by  the  user  or  application  program.  These  three  types  of  time  are  orthogonal  in  the 
support  required  of  the  DBMS. 

In this paper we propose extending the relational algebra [Codd 1970] to enable it to handle
valid time. The relational algebra already supports user-defined time in that user-defined time is
simply another domain, such as integer or character string, provided by the DBMS [Bontempo 1983,
Overmyer & Stonebraker 1982, Tandem 1983]. The relational algebra, however, supports neither
valid time nor transaction time. Hence, for clarity, we refer to the relational algebra hereafter
as the snapshot algebra and our proposed algebra, which supports valid time, as an historical
algebra. We do not consider here any extension of the snapshot algebra or our historical algebra
to support transaction time. Elsewhere [McKenzie & Snodgrass 1987A] we describe an approach
for adding transaction time to the snapshot algebra and show that this approach applies without
change to all historical algebras supporting valid time. This approach for adding transaction time
to the snapshot algebra and historical algebras also provides for scheme evolution [McKenzie &
Snodgrass 1987B]. Because valid time and transaction time are orthogonal, we are able to study
each type of time in isolation.


1  Approach 

To  extend  the  snapshot  algebra  to  support  valid  time,  we  define  formally  an  historical  algebra. 
We provide formal definitions for an historical relation, six algebraic operators, and two historical
aggregate functions. We then show that the algebra has the expressive power of the TQuel
(Temporal QUEry Language) [Snodgrass 1987] facilities that support valid time.


The algebra reflects our basic design goal to define an historical algebra that has as many of the
most desirable properties of an historical algebra as possible. For example, we wanted the historical
algebra to be a straightforward extension of the snapshot algebra so that relations and algebraic
expressions in the snapshot algebra would have equivalent counterparts in the historical algebra.
Yet, we also wanted the algebra to support historical queries and adhere to the user-oriented model
of historical relations as space-filling objects, where the additional, third dimension is valid time.
Hence, we did not restrict historical relations to first-normal form, insist on time-stamping of entire
tuples, or require that time-stamps be atomic-valued because each of these restrictions would have
prevented the algebra from having other, more highly desirable properties. All design decisions
(e.g., to time-stamp attributes rather than tuples) were made so that the resulting algebra would
possess a maximal set of desirable properties. In Section 4 we briefly discuss our major design
decisions and the importance of those decisions in determining the algebra's properties. A detailed
discussion of desirable properties of historical algebras as well as an evaluation of our algebra and
the historical algebras proposed by others, using the identified properties as evaluation criteria,
can be found elsewhere [McKenzie & Snodgrass 1987C].

Efficient  direct  implementation  of  the  algebra  was  not  one  of  our  primary  design  objectives. 
Rather, our goal was to define an algebra that preserves the associative, commutative, and
distributive properties of the snapshot algebra in order that optimization strategies developed for the
snapshot algebra can be applied in implementations of the historical algebra. Our formulation of
the algebraic operators would be inefficient if mapped directly into an implementation. While we
can envision more efficient implementations, incorporating such efficiencies in the semantics would
have  made  it  much  more  complex.  Finally,  we  expect  that  new  optimization  strategies,  unique  to 
the  historical  algebra,  also  will  be  used  in  its  implementation. 

In  the  next  section  we  define  our  historical  algebra.  Then  we  show  that  the  algebra  has  the 
expressive power of the TQuel calculus. We conclude the paper with a discussion of the major
design  decisions  we  made  in  defining  the  algebra.  The  notational  conventions  used  in  the  paper 
are  described  in  Appendix  A. 


2  An  Historical  Algebra  for  Historical  Relations 

The algebra presented in this section is an extension of the snapshot algebra. As such, it retains the
basic restrictions on attribute values found in the snapshot algebra. Neither set-valued attributes
nor tuples with duplicate attribute values are allowed. Valid time is represented by a set-valued
time-stamp  that  is  associated  with  individual  attributes.  A  time-stamp  represents  possibly  disjoint 
intervals  and  the  time-stamps  assigned  to  two  attributes  in  a  given  tuple  need  not  be  identical. 

2.1  Historical  Relation 

Assume that we are given a relation scheme defined as a finite set of attribute names
$\mathcal{N} = \{N_1, \ldots, N_m\}$. Corresponding to each attribute name $N_a$, $1 \le a \le m$, is a
domain $D_a$, an arbitrary, non-empty, finite or denumerable set [Maier 83]. Let the positive integers
be the domain $T$, where each element of $T$ represents a time quantum [Anderson 82]. Assume that,
if $t_1$ immediately precedes $t_2$ in the linear ordering of $T$, then $t_1$ represents the interval
$[t_1, t_2)$. The granularity of time (e.g., nanosecond, month, year) associated with $T$ is arbitrary.
Note that because time is a continuous function, all measures of time can be viewed as measures of
intervals. Hence, when we speak of a "point in time," we actually refer to an interval whose duration
is determined by the granularity of the measure of time being used to specify that "point in time."
Also, let the domain $\mathcal{P}(T)$ be the power set of $T$. An element of $\mathcal{P}(T)$ is then a
set of integers, each of which represents an interval of unit duration. Also, any group of consecutive
integers $t_1, \ldots, t_n$ appearing in an element of $\mathcal{P}(T)$ together represent the interval
$[t_1, t_n + 1)$.

If we let $\mathit{value}$ range over the domain $D_1 \cup \cdots \cup D_m$ and $\mathit{valid}$ range
over the domain $\mathcal{P}(T)$, we can define an historical tuple $p$ as a mapping from the set of
attribute names to the set of ordered pairs $(\mathit{value}, \mathit{valid})$,

$$p : \mathcal{N} \rightarrow (D_1 \cup \cdots \cup D_m,\ \mathcal{P}(T))$$

with the following restrictions:

•  $\forall a$, $1 \le a \le m$, $\mathit{value}(p(N_a)) \in D_a$ and

•  $\exists a$, $1 \le a \le m$, $\mathit{valid}(p(N_a)) \ne \emptyset$.

Hereafter, we will refer to $p(N_a)$ simply as $p_a$, where $a$ denotes attribute $N_a$ in scheme
$\mathcal{N}$, when there is no ambiguity of meaning. Note that it is possible for all but one
attribute to have an empty time-stamp.

Let $P$ be the domain of all tuples over the attribute names of the relation scheme $\mathcal{N}$ and the
domains $D_1, \ldots, D_m$, and $\mathcal{P}(T)$. Define two tuples, $p, p' \in P$, to be value-equivalent if and only
if $\forall a$, $1 \le a \le m$, $\mathit{value}(p_a) = \mathit{value}(p'_a)$. An historical relation $h$ is then defined as a finite set
of  historical  tuples,  with  the  restriction  that  no  two  tuples  in  the  relation  are  value-equivalent.  P 
represents  the  domain  of  all  historical  relations  on  the  relation  scheme. 

EXAMPLE.  Assume  that  we  are  given  the  relation  scheme  Student  =  {Name,  Course}  and  the 
following  set  of  tuples  over  this  relation  scheme.  For  this  and  all  later  examples,  assume  that  the 
granularity  of  time  is  a  semester  relative  to  the  Fall  semester  1980.  Hence,  1  represents  the  Fall 
semester  1980,  2  represents  the  Spring  semester  1981,  etc. 

$S$ = {  ⟨(Phil, {1,3}), (English, {1,3})⟩ ,
         ⟨(Norman, {1,2}), (English, {1,2})⟩ ,
         ⟨(Norman, {5,6}), (Calculus, {5,6})⟩ ,
         ⟨(Phil, {4}), (English, {4})⟩  }

For notational convenience we enclose each attribute value in parentheses and each tuple in angular
brackets (i.e., ⟨ ⟩). We assume the natural mapping between attribute names and attribute values
(e.g., Name → (Phil, {1,3}) and Course → (English, {1,3})). Note that $S$ is not an historical
relation because there are value-equivalent tuples in the set (the first and fourth tuples are
value-equivalent). If we replace the two value-equivalent tuples in $S$ with a single tuple, then the
new set $S_1$ is an historical relation.

$S_1$ = {  ⟨(Phil, {1,3,4}), (English, {1,3,4})⟩ ,
           ⟨(Norman, {1,2}), (English, {1,2})⟩ ,
           ⟨(Norman, {5,6}), (Calculus, {5,6})⟩  }  □
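
The replacement of value-equivalent tuples by a single tuple can be sketched in Python. This is only an illustration under stated assumptions, not part of the formal definition: the name `coalesce` and the encoding of a tuple as a sequence of (value, time-stamp) pairs are hypothetical, and the attribute-wise union of time-stamps is inferred from the example above.

```python
def coalesce(tuples):
    """Merge value-equivalent tuples, unioning time-stamps attribute by attribute."""
    merged = {}  # value components -> list of time-stamps, one per attribute
    for tup in tuples:
        values = tuple(value for value, _ in tup)
        stamps = [frozenset(valid) for _, valid in tup]
        if values in merged:
            # A value-equivalent tuple was seen before: union the time-stamps.
            stamps = [old | new for old, new in zip(merged[values], stamps)]
        merged[values] = stamps
    return {tuple(zip(values, stamps)) for values, stamps in merged.items()}


# The set S from the example; its first and fourth tuples are value-equivalent.
S = [
    (("Phil", {1, 3}), ("English", {1, 3})),
    (("Norman", {1, 2}), ("English", {1, 2})),
    (("Norman", {5, 6}), ("Calculus", {5, 6})),
    (("Phil", {4}), ("English", {4})),
]
S1 = coalesce(S)
```

Under this encoding, `S1` holds three tuples, and the Phil/English tuple carries the time-stamp {1,3,4} on both attributes.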


2.2  Historical  Operators 

We present eight operators that serve to define the historical algebra. Five of these operators
(union, difference, cartesian product, projection, and selection) are analogous to the five
operators that serve to define the snapshot algebra for snapshot relations [Ullman 82]. Each of
these five operators on historical relations is represented as $\hat{op}$ to distinguish it from its
snapshot algebra counterpart $op$. Historical derivation is a new operator that replaces the time-stamp of
each attribute in a tuple with a new time-stamp, where the new time-stamps are computed from
the existing time-stamps of the tuple's attributes. The remaining two operators, aggregation and
unique aggregation, compute aggregates. After defining the operators, we show that all eight
preserve the value-equivalence property of historical relations.

EXAMPLE. The three relations $S_1$, $S_2$, and $S_3$ are used in the examples that accompany the
definitions of the operators. $S_2$, like $S_1$, is an historical relation over the relation scheme Student =
{Name, Course}. $S_3$ is an historical relation over the relation scheme Home = {Name, State}.
While the attributes of a tuple in $S_1$, $S_2$, and $S_3$ have the same time-stamp, in general, attributes
within a tuple can have different time-stamps.

$S_2$ = {  ⟨(Phil, {3,4}), (English, {3,4})⟩ ,
           ⟨(Norman, {7}), (Calculus, {7})⟩ ,
           ⟨(Tom, {5,6}), (English, {5,6})⟩  }

$S_3$ = {  ⟨(Phil, {1,2,3}), (Kansas, {1,2,3})⟩ ,
           ⟨(Phil, {4,5,6}), (Virginia, {4,5,6})⟩ ,
           ⟨(Norman, {1,2,5,6}), (Virginia, {1,2,5,6})⟩ ,
           ⟨(Norman, {7,8}), (Texas, {7,8})⟩  }  □


2.2.1  Union 


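Let $Q$ and $R$ be historical relations of $m$-tuples over the same relation scheme. Then the historical
union of $Q$ and $R$, denoted $Q \mathbin{\hat{\cup}} R$, is defined as

$$
\begin{aligned}
Q \mathbin{\hat{\cup}} R = {} & \{\, q^m \mid Q(q) \wedge \neg(\exists r,\ r \in R \wedge \forall a,\ 1 \le a \le m,\ \mathit{value}(q_a) = \mathit{value}(r_a)) \,\} \\
{} \cup {} & \{\, r^m \mid R(r) \wedge \neg(\exists q,\ q \in Q \wedge \forall a,\ 1 \le a \le m,\ \mathit{value}(r_a) = \mathit{value}(q_a)) \,\} \\
{} \cup {} & \{\, u^m \mid \exists q\, \exists r,\ q \in Q \wedge r \in R \wedge \forall a,\ 1 \le a \le m,\ \mathit{value}(u_a) = \mathit{value}(q_a) = \mathit{value}(r_a) \\
& \qquad \wedge\ \mathit{valid}(u_a) = \mathit{valid}(q_a) \cup \mathit{valid}(r_a) \,\}
\end{aligned}
$$

$Q \mathbin{\hat{\cup}} R$ is the set of tuples that are in $Q$, $R$, or both, with the restriction that each pair of
value-equivalent tuples is represented by a single tuple. Note that if a tuple in $Q$ and a tuple in $R$ are
value-equivalent, then they are represented in $Q \mathbin{\hat{\cup}} R$ by a single tuple. The time-stamp associated
with each attribute of this tuple in $Q \mathbin{\hat{\cup}} R$ is the set union of the time-stamps of the corresponding
attribute in the value-equivalent tuples in $Q$ and $R$.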

EXAMPLE.  $S_1 \mathbin{\hat{\cup}} S_2$ = {  ⟨(Phil, {1,3,4}), (English, {1,3,4})⟩ ,
                                              ⟨(Norman, {1,2}), (English, {1,2})⟩ ,
                                              ⟨(Norman, {5,6,7}), (Calculus, {5,6,7})⟩ ,
                                              ⟨(Tom, {5,6}), (English, {5,6})⟩  }  □
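
The union definition can be illustrated with a minimal Python sketch. The encoding is an assumption made here for illustration: a relation is a dictionary from the tuple of value components to a tuple of time-stamps (one `frozenset` per attribute), so value-equivalent tuples share a key. The names `relation` and `h_union` are hypothetical.

```python
def relation(rows):
    """Build a relation as {value components: tuple of time-stamps}."""
    return {values: tuple(frozenset(s) for s in stamps)
            for values, stamps in rows.items()}


def h_union(q, r):
    """Historical union: value-equivalent tuples are merged into one tuple."""
    result = dict(q)
    for values, stamps in r.items():
        if values in result:
            # Same value components in both relations: union the time-stamps.
            stamps = tuple(a | b for a, b in zip(result[values], stamps))
        result[values] = stamps
    return result


S1 = relation({
    ("Phil", "English"): ({1, 3, 4}, {1, 3, 4}),
    ("Norman", "English"): ({1, 2}, {1, 2}),
    ("Norman", "Calculus"): ({5, 6}, {5, 6}),
})
S2 = relation({
    ("Phil", "English"): ({3, 4}, {3, 4}),
    ("Norman", "Calculus"): ({7}, {7}),
    ("Tom", "English"): ({5, 6}, {5, 6}),
})
S1_union_S2 = h_union(S1, S2)
```

With these inputs the Norman/Calculus tuple receives the time-stamp {5,6,7}, matching the example, and swapping the arguments gives the same relation.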

2.2.2  Difference 

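Let $Q$ and $R$ be historical relations of $m$-tuples over the same relation scheme. Then the historical
difference of $Q$ and $R$, denoted $Q \mathbin{\hat{-}} R$, is defined as

$$
\begin{aligned}
Q \mathbin{\hat{-}} R = {} & \{\, q^m \mid Q(q) \wedge \neg(\exists r,\ r \in R \wedge \forall a,\ 1 \le a \le m,\ \mathit{value}(q_a) = \mathit{value}(r_a)) \,\} \\
{} \cup {} & \{\, u^m \mid (\exists q\, \exists r,\ q \in Q \wedge r \in R \wedge \forall a,\ 1 \le a \le m,\ \mathit{value}(u_a) = \mathit{value}(q_a) = \mathit{value}(r_a) \\
& \qquad \wedge\ \mathit{valid}(u_a) = \mathit{valid}(q_a) - \mathit{valid}(r_a)) \\
& \quad \wedge\ (\exists a,\ 1 \le a \le m \wedge \mathit{valid}(u_a) \ne \emptyset) \,\}
\end{aligned}
$$

$Q \mathbin{\hat{-}} R$ is the set of all tuples that satisfy three criteria. First, a tuple in $Q \mathbin{\hat{-}} R$ must have a
value-equivalent counterpart in $Q$. Second, the time-stamp of each attribute of a tuple in $Q \mathbin{\hat{-}} R$ must
equal the set difference of the time-stamps of the corresponding attribute in the value-equivalent
tuple in $Q$ and the value-equivalent tuple in $R$, if any. Third, the time-stamp of at least one
attribute of each tuple in $Q \mathbin{\hat{-}} R$ must be non-empty.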


EXAMPLE.  $S_1 \mathbin{\hat{-}} S_2$ = {  ⟨(Phil, {1}), (English, {1})⟩ ,
                                           ⟨(Norman, {1,2}), (English, {1,2})⟩ ,
                                           ⟨(Norman, {5,6}), (Calculus, {5,6})⟩  }  □
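
A corresponding sketch for the difference, again assuming the hypothetical dictionary encoding (value components mapped to a tuple of `frozenset` time-stamps) and the hypothetical names `relation` and `h_difference`:

```python
def relation(rows):
    """Build a relation as {value components: tuple of time-stamps}."""
    return {values: tuple(frozenset(s) for s in stamps)
            for values, stamps in rows.items()}


def h_difference(q, r):
    """Historical difference: subtract time-stamps of value-equivalent tuples."""
    result = {}
    for values, stamps in q.items():
        if values in r:
            # Attribute-wise set difference against the counterpart in r.
            stamps = tuple(a - b for a, b in zip(stamps, r[values]))
        # Keep the tuple only if at least one time-stamp is non-empty.
        if any(stamps):
            result[values] = stamps
    return result


S1 = relation({
    ("Phil", "English"): ({1, 3, 4}, {1, 3, 4}),
    ("Norman", "English"): ({1, 2}, {1, 2}),
    ("Norman", "Calculus"): ({5, 6}, {5, 6}),
})
S2 = relation({
    ("Phil", "English"): ({3, 4}, {3, 4}),
    ("Norman", "Calculus"): ({7}, {7}),
    ("Tom", "English"): ({5, 6}, {5, 6}),
})
S1_minus_S2 = h_difference(S1, S2)
```

The third criterion in the text corresponds to the `any(stamps)` check: subtracting a relation from itself leaves every time-stamp empty, so the result is the empty relation.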

2.2.3  Cartesian  Product 

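Let $Q$ be an historical relation of $m_1$-tuples and $R$ be an historical relation of $m_2$-tuples. Then
$Q \mathbin{\hat{\times}} R$, the historical cartesian product of $Q$ and $R$, is defined as

$$
\begin{aligned}
Q \mathbin{\hat{\times}} R = \{\, u^{m_1+m_2} \mid\ & (\exists q,\ q \in Q \wedge \forall a,\ 1 \le a \le m_1,\ \mathit{value}(u_a) = \mathit{value}(q_a) \wedge \mathit{valid}(u_a) = \mathit{valid}(q_a)) \\
\wedge\ & (\exists r,\ r \in R \wedge \forall a,\ 1 \le a \le m_2,\ \mathit{value}(u_{m_1+a}) = \mathit{value}(r_a) \wedge \mathit{valid}(u_{m_1+a}) = \mathit{valid}(r_a)) \,\}
\end{aligned}
$$

The cartesian product operator for historical relations is identical to the cartesian product operator
for snapshot relations. $Q \mathbin{\hat{\times}} R$ is the set of $(m_1 + m_2)$-tuples whose components
$u_1, \ldots, u_{m_1}$ form a tuple in $Q$ and whose components $u_{m_1+1}, \ldots, u_{m_1+m_2}$ form a tuple in $R$.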

EXAMPLE.

$S_1 \mathbin{\hat{\times}} S_3$ = {  ⟨(Phil, {1,3,4}), (English, {1,3,4}), (Phil, {1,2,3}), (Kansas, {1,2,3})⟩ ,
            ⟨(Phil, {1,3,4}), (English, {1,3,4}), (Phil, {4,5,6}), (Virginia, {4,5,6})⟩ ,
            ⟨(Phil, {1,3,4}), (English, {1,3,4}), (Norman, {1,2,5,6}), (Virginia, {1,2,5,6})⟩ ,
            ⟨(Phil, {1,3,4}), (English, {1,3,4}), (Norman, {7,8}), (Texas, {7,8})⟩ ,
            ⟨(Norman, {1,2}), (English, {1,2}), (Phil, {1,2,3}), (Kansas, {1,2,3})⟩ ,
            ⟨(Norman, {1,2}), (English, {1,2}), (Phil, {4,5,6}), (Virginia, {4,5,6})⟩ ,
            ⟨(Norman, {1,2}), (English, {1,2}), (Norman, {1,2,5,6}), (Virginia, {1,2,5,6})⟩ ,
            ⟨(Norman, {1,2}), (English, {1,2}), (Norman, {7,8}), (Texas, {7,8})⟩ ,
            ⟨(Norman, {5,6}), (Calculus, {5,6}), (Phil, {1,2,3}), (Kansas, {1,2,3})⟩ ,
            ⟨(Norman, {5,6}), (Calculus, {5,6}), (Phil, {4,5,6}), (Virginia, {4,5,6})⟩ ,
            ⟨(Norman, {5,6}), (Calculus, {5,6}), (Norman, {1,2,5,6}), (Virginia, {1,2,5,6})⟩ ,
            ⟨(Norman, {5,6}), (Calculus, {5,6}), (Norman, {7,8}), (Texas, {7,8})⟩  }

Let this be relation $S_4$ over the relation scheme {SName, Course, HName, State}.  □
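
Because the historical cartesian product simply concatenates tuples, a sketch is short. As before, the dictionary encoding and the names `relation` and `h_product` are illustrative assumptions rather than anything defined in the paper.

```python
from itertools import product


def relation(rows):
    """Build a relation as {value components: tuple of time-stamps}."""
    return {values: tuple(frozenset(s) for s in stamps)
            for values, stamps in rows.items()}


def h_product(q, r):
    """Historical cartesian product: concatenate values and time-stamps."""
    return {qv + rv: qs + rs
            for (qv, qs), (rv, rs) in product(q.items(), r.items())}


S1 = relation({
    ("Phil", "English"): ({1, 3, 4}, {1, 3, 4}),
    ("Norman", "English"): ({1, 2}, {1, 2}),
    ("Norman", "Calculus"): ({5, 6}, {5, 6}),
})
S3 = relation({
    ("Phil", "Kansas"): ({1, 2, 3}, {1, 2, 3}),
    ("Phil", "Virginia"): ({4, 5, 6}, {4, 5, 6}),
    ("Norman", "Virginia"): ({1, 2, 5, 6}, {1, 2, 5, 6}),
    ("Norman", "Texas"): ({7, 8}, {7, 8}),
})
S4 = h_product(S1, S3)
```

Each result tuple keeps the time-stamps of its source tuples unchanged; no temporal intersection is taken at this stage.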


2.2.4  Selection 




Let $R$ be an historical relation of $m$-tuples. Also, let $F$ be a boolean function involving

•  Attribute names $N_1, \ldots, N_m$;

•  Constants from the domains $D_1, \ldots, D_m$;

•  Relational operators $<$, $=$, $>$; and

•  Logical operators $\wedge$, $\vee$, and $\neg$

where, to evaluate $F$ for a tuple $r$, $r \in R$, we substitute the value components of the attributes of
$r$ for all occurrences of their corresponding attribute names in $F$. Then the historical selection of
$R$, denoted by $\hat{\sigma}_F(R)$, is defined as

$$\hat{\sigma}_F(R) = \{\, r^m \mid r \in R \wedge F(\mathit{value}(r_1), \ldots, \mathit{value}(r_m)) \,\}$$

Thus, $\hat{\sigma}$ is identical to $\sigma$ in the snapshot algebra. $\hat{\sigma}_F(R)$ is simply the set of tuples in $R$ for which
$F$ is true.


EXAMPLE.

$\hat{\sigma}_{SName = HName}(S_4)$ =
    {  ⟨(Phil, {1,3,4}), (English, {1,3,4}), (Phil, {1,2,3}), (Kansas, {1,2,3})⟩ ,
       ⟨(Phil, {1,3,4}), (English, {1,3,4}), (Phil, {4,5,6}), (Virginia, {4,5,6})⟩ ,
       ⟨(Norman, {1,2}), (English, {1,2}), (Norman, {1,2,5,6}), (Virginia, {1,2,5,6})⟩ ,
       ⟨(Norman, {1,2}), (English, {1,2}), (Norman, {7,8}), (Texas, {7,8})⟩ ,
       ⟨(Norman, {5,6}), (Calculus, {5,6}), (Norman, {1,2,5,6}), (Virginia, {1,2,5,6})⟩ ,
       ⟨(Norman, {5,6}), (Calculus, {5,6}), (Norman, {7,8}), (Texas, {7,8})⟩  }

Let this be relation $S_5$ over the relation scheme {SName, Course, HName, State}.  □
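
Selection inspects only value components, which the following sketch makes explicit. The dictionary encoding, the helper `relation`, and the function `h_select` (which takes the predicate $F$ as an ordinary Python callable) are assumptions introduced for illustration.

```python
def relation(rows):
    """Build a relation as {value components: tuple of time-stamps}."""
    return {values: tuple(frozenset(s) for s in stamps)
            for values, stamps in rows.items()}


def h_select(rel, predicate):
    """Historical selection: keep tuples whose value components satisfy F."""
    return {values: stamps for values, stamps in rel.items()
            if predicate(*values)}


S1 = relation({
    ("Phil", "English"): ({1, 3, 4}, {1, 3, 4}),
    ("Norman", "English"): ({1, 2}, {1, 2}),
    ("Norman", "Calculus"): ({5, 6}, {5, 6}),
})
S3 = relation({
    ("Phil", "Kansas"): ({1, 2, 3}, {1, 2, 3}),
    ("Phil", "Virginia"): ({4, 5, 6}, {4, 5, 6}),
    ("Norman", "Virginia"): ({1, 2, 5, 6}, {1, 2, 5, 6}),
    ("Norman", "Texas"): ({7, 8}, {7, 8}),
})
# S4 is the cartesian product of S1 and S3 (values and stamps concatenated).
S4 = {sv + hv: ss + hs for sv, ss in S1.items() for hv, hs in S3.items()}

# F is "SName = HName" over the scheme {SName, Course, HName, State}.
S5 = h_select(S4, lambda sname, course, hname, state: sname == hname)
```

The time-stamps of the surviving tuples are passed through untouched, mirroring the remark that the operator behaves like its snapshot counterpart.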

2.2.5  Projection 

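Let $R$ be an historical relation of $m$-tuples and let $a_1, \ldots, a_n$ be distinct integers in the range 1
to $m$. Then the historical projection of $R$, denoted by $\hat{\pi}_{N_{a_1}, \ldots, N_{a_n}}(R)$, is defined as

$$
\begin{aligned}
\hat{\pi}_{N_{a_1}, \ldots, N_{a_n}}(R) = \{\, u^n \mid\ & (\forall l,\ 1 \le l \le n,\ \forall t,\ t \in \mathit{valid}(u_l), \\
& \quad \exists r,\ (r \in R \\
& \qquad \wedge\ \forall h,\ 1 \le h \le n,\ \mathit{value}(u_h) = \mathit{value}(r_{a_h}) \\
& \qquad \wedge\ t \in \mathit{valid}(r_{a_l})) \\
& ) \\
\wedge\ & (\forall r,\ (r \in R \wedge \forall l,\ 1 \le l \le n,\ \mathit{value}(r_{a_l}) = \mathit{value}(u_l)), \\
& \quad \forall h,\ 1 \le h \le n,\ \mathit{valid}(r_{a_h}) \subseteq \mathit{valid}(u_h) \\
& ) \\
\wedge\ & (\exists l,\ 1 \le l \le n \wedge \mathit{valid}(u_l) \ne \emptyset) \,\}
\end{aligned}
$$

Like the projection operator for snapshot relations, the projection operator for historical relations
retains, for each tuple, only the tuple components that correspond to the attribute names in
$\{N_{a_1}, \ldots, N_{a_n}\}$. All other tuple components are removed. Value-equivalent tuples in the resulting
set are then combined and tuples that have an empty valid component for all tuple components
are removed.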


EXAMPLE.  $\hat{\pi}_{SName, State}(S_5)$ = {  ⟨(Phil, {1,3,4}), (Kansas, {1,2,3})⟩ ,
                                               ⟨(Phil, {1,3,4}), (Virginia, {4,5,6})⟩ ,
                                               ⟨(Norman, {1,2,5,6}), (Virginia, {1,2,5,6})⟩ ,
                                               ⟨(Norman, {1,2,5,6}), (Texas, {7,8})⟩  }

Let this be relation $S_6$ over the relation scheme Enrollment = {Name, State}. Also assume that
in this relation the time-stamp associated with the value of the attribute Name represents the
interval(s) when the specified student was enrolled and that the time-stamp associated with the
value of the attribute State represents the interval(s) when the student was a resident of the
specified state.  □
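
The projection example can be reproduced with a short sketch. It assumes the same hypothetical dictionary encoding as a convenience; `h_project` takes zero-based attribute positions instead of attribute names, which is a simplification made here and not the paper's notation.

```python
def relation(rows):
    """Build a relation as {value components: tuple of time-stamps}."""
    return {values: tuple(frozenset(s) for s in stamps)
            for values, stamps in rows.items()}


def h_project(rel, positions):
    """Historical projection onto the attributes at the given positions."""
    result = {}
    for values, stamps in rel.items():
        key = tuple(values[i] for i in positions)
        kept = tuple(stamps[i] for i in positions)
        if key in result:
            # Value-equivalent after projection: union the time-stamps.
            kept = tuple(a | b for a, b in zip(result[key], kept))
        result[key] = kept
    # Drop tuples whose remaining time-stamps are all empty.
    return {key: kept for key, kept in result.items() if any(kept)}


# S5 over the scheme {SName, Course, HName, State}.
S5 = relation({
    ("Phil", "English", "Phil", "Kansas"): ({1, 3, 4}, {1, 3, 4}, {1, 2, 3}, {1, 2, 3}),
    ("Phil", "English", "Phil", "Virginia"): ({1, 3, 4}, {1, 3, 4}, {4, 5, 6}, {4, 5, 6}),
    ("Norman", "English", "Norman", "Virginia"): ({1, 2}, {1, 2}, {1, 2, 5, 6}, {1, 2, 5, 6}),
    ("Norman", "English", "Norman", "Texas"): ({1, 2}, {1, 2}, {7, 8}, {7, 8}),
    ("Norman", "Calculus", "Norman", "Virginia"): ({5, 6}, {5, 6}, {1, 2, 5, 6}, {1, 2, 5, 6}),
    ("Norman", "Calculus", "Norman", "Texas"): ({5, 6}, {5, 6}, {7, 8}, {7, 8}),
})
S6 = h_project(S5, (0, 3))  # positions of SName and State
```

The two Norman/Virginia tuples of $S_5$ become value-equivalent once Course is dropped, so their Name time-stamps {1,2} and {5,6} are combined into {1,2,5,6}.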

The operator $\hat{\pi}$ also supports projections on expressions. For an arbitrary $n$, let $\mathit{Evalue}_l$,
$1 \le l \le n$, be an arbitrary expression involving the attribute names $N_a$, $1 \le a \le m$. $\mathit{Evalue}_l$ is
evaluated, for a tuple $r$, $r \in R$, by substituting the value components of the attributes of $r$ for
all occurrences of their corresponding attribute names in $\mathit{Evalue}_l$. Also, let $\mathit{Evalid}_l$, $1 \le l \le n$,
be an arbitrary expression involving the attribute names $N_a$, $1 \le a \le m$, where $\mathit{Evalid}_l$
is evaluated for a tuple $r$, $r \in R$, by substituting the valid components of the attributes of $r$
for all occurrences of their corresponding attribute names in $\mathit{Evalid}_l$. In addition, assume that
evaluation of $\mathit{Evalue}_l$ for every tuple $r$ produces an element of the domain $D_b$, $1 \le b \le m$, and that
evaluation of $\mathit{Evalid}_l$ produces an element of the domain $\mathcal{P}(T)$. Then the definition of $\hat{\pi}$, now
denoted by $\hat{\pi}_{(\mathit{Evalue}_1, \mathit{Evalid}_1), \ldots, (\mathit{Evalue}_n, \mathit{Evalid}_n)}(R)$, is constructed from the definition above simply
by substituting $\mathit{Evalue}_h(r)$ for $\mathit{value}(r_{a_h})$, $\mathit{Evalid}_h(r)$ for $\mathit{valid}(r_{a_h})$, $\mathit{Evalue}_l(r)$ for $\mathit{value}(r_{a_l})$, and
$\mathit{Evalid}_l(r)$ for $\mathit{valid}(r_{a_l})$. Note that this definition of the $\hat{\pi}$ operator is simply a more general
version of the definition presented earlier, where $N_{a_l}$, $1 \le l \le n$, is assumed to be the ordered pair
of expressions $(N_{a_l}, N_{a_l})$.

2.2.6  Historical  Derivation 

The historical derivation operator $\hat{\delta}$ is a new operator that does not have an analogous snapshot
operator. It replaces the time-stamp of each attribute in a tuple with a new time-stamp, where
the new time-stamps are computed from the existing time-stamps of the tuple's attributes. $\hat{\delta}$ is
effectively a combination of selection and projection on a tuple's attribute time-stamps.

Several functions, defined on the domains $T$ and $\mathcal{P}(T)$, are used either directly or indirectly
in the definition of the historical derivation operator. Before defining the derivation operator itself,
we describe informally these auxiliary functions. Formal definitions appear in Appendix B.

FIRST takes a set of times from the domain $\mathcal{P}(T)$ and maps it into the earliest time in the set.

LAST takes a set of times from the domain $\mathcal{P}(T)$ and maps it into the latest time in the set.

PRED is the predecessor function on the domain $T$. It maps a time into its immediate predecessor
in the linear ordering of all times.

SUCC  is  the  successor  function  on  the  domain  T.  It  maps  a  time  into  its  immediate  successor  in 
the  linear  ordering  of  all  times. 

EXTEND  maps  two  times  into  the  set  of  times  that  represents  the  interval  between  the  first  time 
and  the  second  time. 

INTERVAL  maps  a  set  of  times  into  the  set  of  intervals  containing  the  minimum  number  of 
non-disjoint  intervals  represented  by  the  input  set.  Each  time  in  the  input  set  appears  in  exactly 
one  interval  in  the  output  set  and  each  interval  in  the  output  set  is  itself  represented  by  a  set  of 
times. 

EXAMPLE. Consider the following tuple taken from the relation $S_6$ defined previously:

$r$ = ⟨(Norman, {1,2,5,6}), (Texas, {7,8})⟩

then

INTERVAL($\mathit{valid}(r(\mathit{Name}))$) = {{1,2}, {5,6}}
INTERVAL($\mathit{valid}(r(\mathit{State}))$) = {{7,8}}  □
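
The informal descriptions of the auxiliary functions suggest the following Python sketch. The formal definitions are in Appendix B, which is not reproduced here, so several details are assumptions: time-stamps are modeled as sets of positive integers, `interval` returns the maximal runs of consecutive integers as a list ordered by time, and `extend` is assumed to include both end points.

```python
def first(times):
    """FIRST: the earliest time in a non-empty set of times."""
    return min(times)


def last(times):
    """LAST: the latest time in a non-empty set of times."""
    return max(times)


def extend(t1, t2):
    """EXTEND: times from t1 to t2 (assumed here to include both ends)."""
    return frozenset(range(t1, t2 + 1))


def interval(times):
    """INTERVAL: split a set of times into maximal runs of consecutive integers."""
    runs, current = [], []
    for t in sorted(times):
        if current and t != current[-1] + 1:
            # A gap ends the current run.
            runs.append(frozenset(current))
            current = []
        current.append(t)
    if current:
        runs.append(frozenset(current))
    return runs


name_intervals = interval({1, 2, 5, 6})
state_intervals = interval({7, 8})
```

For the tuple in the example, `name_intervals` holds the two runs {1,2} and {5,6}, and `state_intervals` holds the single run {7,8}.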


Given these auxiliary functions, we can now define the historical derivation operator on historical
relations. Let $R$ be an historical relation of $m$-tuples. Let $V_a$, $1 \le a \le m$, be temporal
functions involving

•  Attribute names $N_1, \ldots, N_m$;

•  Constants from the domain $\mathcal{I}$ of non-disjoint intervals defined in Appendix B;

•  Functions FIRST, LAST, and EXTEND; and

•  Set operators $\cup$, $\cap$, and $-$;

and let $G$ be a boolean function involving

•  Temporal functions, as just described;

•  Relational operators $<$, $=$, and $>$; and

•  Logical operators $\wedge$, $\vee$, and $\neg$.

The functions $G$ and $V_a$, $1 \le a \le m$, are always evaluated for a specific assignment of
non-disjoint intervals to attribute names $N_1, \ldots, N_m$. $G$ evaluates to either true or false and $V_a$
evaluates to an element of $\mathcal{P}(T)$. For a tuple $r$, $r \in R$, and intervals $I_{N_c}$, $1 \le c \le m$,
$I_{N_c} \in \mathrm{INTERVAL}(\mathit{valid}(r_c))$, we evaluate $G(I_{N_1}, \ldots, I_{N_m})$ by substituting $I_{N_c}$ for all
occurrences of $N_c$ in $G$. Likewise, we evaluate $V_a(I_{N_1}, \ldots, I_{N_m})$ by substituting $I_{N_c}$ for all
occurrences of $N_c$ in $V_a$. If any one of $r$'s attribute values has a disjoint time-stamp, there will be
multiple distinct evaluations of $G$ (and $V_a$) for $r$, one for each possible assignment of intervals to
attribute names, each resulting in a value of true or false for $G$ (and a set of time quanta for $V_a$).

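We can now define the derivation of the historical relation $R$, denoted $\hat{\delta}_{G, V_1, \ldots, V_m}(R)$, as

$$
\begin{aligned}
\hat{\delta}_{G, V_1, \ldots, V_m}(R) = \{\, u^m \mid \exists r,\ (& r \in R \\
& \wedge\ \forall a,\ 1 \le a \le m, \\
& \quad (\mathit{value}(u_a) = \mathit{value}(r_a) \\
& \quad \wedge\ (\forall t,\ t \in \mathit{valid}(u_a), \\
& \qquad \exists I_{N_1} \cdots \exists I_{N_m},\ (I_{N_1} \in \mathrm{INTERVAL}(\mathit{valid}(r_1)) \wedge \cdots \\
& \qquad\quad \wedge\ I_{N_m} \in \mathrm{INTERVAL}(\mathit{valid}(r_m)) \\
& \qquad\quad \wedge\ G(I_{N_1}, \ldots, I_{N_m}) \\
& \qquad\quad \wedge\ t \in V_a(I_{N_1}, \ldots, I_{N_m})) \\
& \quad\ ) \\
& \quad \wedge\ (\forall I_{N_1} \cdots \forall I_{N_m},\ (I_{N_1} \in \mathrm{INTERVAL}(\mathit{valid}(r_1)) \wedge \cdots \\
& \qquad\quad \wedge\ I_{N_m} \in \mathrm{INTERVAL}(\mathit{valid}(r_m)) \\
& \qquad\quad \wedge\ G(I_{N_1}, \ldots, I_{N_m})), \\
& \qquad V_a(I_{N_1}, \ldots, I_{N_m}) \subseteq \mathit{valid}(u_a) \\
& \quad\ ) \\
& \quad ) \\
& \wedge\ \exists a,\ 1 \le a \le m \wedge \mathit{valid}(u_a) \ne \emptyset \\
& ) \,\}
\end{aligned}
$$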


For a tuple $r$, $r \in R$, the historical derivation operator determines new time-stamps for $r$'s
attributes. The historical derivation function first determines all possible assignments of intervals
to attribute names for which the boolean function $G$ is true. For each assignment of intervals to
attribute names for which $G$ is true, the operator evaluates $V_a$, $1 \le a \le m$. The sets of times
resulting from the evaluations of $V_a$ are then combined to form a new time-stamp for attribute
$N_a$. For notational convenience, we assume that if only one V-function is provided, it applies to
all attributes.

EXAMPLES. 

$\hat{\delta}_{(\mathit{Name} \cap \mathit{State}) = \mathit{Name},\ \mathit{Name}}(S_6)$ = {  ⟨(Phil, {1}), (Kansas, {1})⟩ ,
                                            ⟨(Norman, {1,2,5,6}), (Virginia, {1,2,5,6})⟩  }

In this example, $G$ is $(\mathit{Name} \cap \mathit{State}) = \mathit{Name}$ and $V_1$ and $V_2$ are both $\mathit{Name}$. A student
tuple $s$, $s \in S_6$, satisfies condition $G$ if the student had at least one interval of enrollment (i.e.,
$I_{Name} \in \mathrm{INTERVAL}(\mathit{valid}(s(\mathit{Name})))$) during which his home state (i.e., State) did not change
(i.e., $I_{Name} \cap I_{State} = I_{Name}$, where $I_{State} \in \mathrm{INTERVAL}(\mathit{valid}(s(\mathit{State})))$). The new time-stamp
for each attribute of a tuple that satisfies $G$ for some assignment of intervals $I_{Name}$ and $I_{State}$ is
simply the union of the $I_{Name}$ intervals from each assignment of intervals that satisfy $G$. In the
first tuple in $S_6$, there are three intervals, two assigned to the attribute Name ({1}, {3,4}) and
one assigned to the attribute State ({1,2,3}). From this tuple, we find that Phil was a resident of
Kansas during his first interval of enrollment ($G(\{1\}, \{1,2,3\}) = \{1\} \cap \{1,2,3\} = \{1\}$) but was
a resident of Kansas during only part of his second interval of enrollment
($G(\{3,4\}, \{1,2,3\}) = \{3,4\} \cap \{1,2,3\} \ne \{3,4\}$). Hence, this tuple's attributes are assigned a
time-stamp of {1} in the resulting relation. From the second tuple in $S_6$ we find that Phil was not a
resident of Virginia during his first interval of enrollment
($G(\{1\}, \{4,5,6\}) = \{1\} \cap \{4,5,6\} \ne \{1\}$) and lived in Virginia during only part of his second
interval of enrollment ($G(\{3,4\}, \{4,5,6\}) = \{3,4\} \cap \{4,5,6\} \ne \{3,4\}$).
Hence, the time-stamp for this tuple's attributes would be assigned the empty set in the resulting
relation except the definition of the historical derivation operator disallows tuples whose
attributes all have an empty time-stamp. This tuple is therefore eliminated and does not appear
in the resulting relation. From the third tuple in $S_6$ we find that Norman was a resident of
Virginia during both of his intervals of enrollment ($G(\{1,2\}, \{1,2\}) = \{1,2\} \cap \{1,2\} = \{1,2\}$ and
$G(\{5,6\}, \{5,6\}) = \{5,6\} \cap \{5,6\} = \{5,6\}$). Hence, this tuple's attributes are assigned a time-stamp
of {1,2,5,6} in the resulting relation. From the fourth tuple in $S_6$ we find that Norman was not a
resident of Texas at any time during his enrollment ($G(\{1,2\}, \{7,8\}) = \{1,2\} \cap \{7,8\} \ne \{1,2\}$ and
$G(\{5,6\}, \{7,8\}) = \{5,6\} \cap \{7,8\} \ne \{5,6\}$); this tuple is therefore eliminated from the resulting
relation.


$\hat{\delta}_{(\mathit{Name} \cap \mathit{State}) \ne \mathit{Name}\ \wedge\ (\mathit{Name} \cap \mathit{State}) \ne \emptyset,\ \mathit{Name} \cap \mathit{State}}(S_6)$ = {  ⟨(Phil, {3}), (Kansas, {3})⟩ ,
                                            ⟨(Phil, {4}), (Virginia, {4})⟩  }

A student tuple $s$, $s \in S_6$, satisfies condition $G$ if the student had at least one interval of enrollment
during which his home state changed. The new time-stamp for each tuple that satisfies $G$ for some
assignment of intervals $I_{Name}$ and $I_{State}$ is the union of $I_{Name} \cap I_{State}$ from each assignment of
intervals that satisfy $G$. From the first tuple in $S_6$ we find that Phil had one interval of enrollment
during which his home state changed (i.e., $\{3,4\} \cap \{1,2,3\} \ne \{3,4\}$ and $\{3,4\} \cap \{1,2,3\} \ne \emptyset$).
Hence, this tuple's attributes are assigned a time-stamp of $\{3,4\} \cap \{1,2,3\} = \{3\}$ in the resulting
relation. From the second tuple in $S_6$ we find that Phil had one interval of enrollment during which
his home state changed. Hence, this tuple's attributes are assigned a time-stamp of {4} in the
resulting relation. Note that Norman does not satisfy the restriction; his home state was the same
during his two periods of enrollment. Hence, the third and fourth tuples are eliminated from the
resulting relation.  □

Note that the historical derivation operator actually performs two functions. First, it performs
a selection function on the valid component of a tuple's attributes. For a tuple $r$, if $G$ is false when
an interval from the valid component of each of $r$'s attributes is substituted for each occurrence
of its corresponding attribute name in $G$, then the temporal information represented by that
combination of intervals is not used in the calculation of the new time-stamps for $r$'s attributes.
Secondly, the derivation operator calculates a new time-stamp for attribute $N_a$, $1 \le a \le m$, from
those combinations of intervals for which $G$ is true, using $V_a$. If $V_1, \ldots, V_m$ are all the same
function, the tuple is effectively converted from attribute time-stamping to tuple time-stamping.
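The two functions of the derivation operator can be illustrated with a minimal Python sketch. It is not the formal definition: it assumes integer times, takes an interval to be a maximal run of consecutive times in a time-stamp, and uses a copy of $S_6$ reconstructed from the examples in this section. The names `intervals`, `derive`, `overlap`, and `changed` are hypothetical helpers introduced only for illustration.

```python
from itertools import product

def intervals(stamp):
    """Split a set of integer times into maximal runs of consecutive times."""
    runs, run = [], []
    for t in sorted(stamp):
        if run and t != run[-1] + 1:
            runs.append(frozenset(run))
            run = []
        run.append(t)
    if run:
        runs.append(frozenset(run))
    return runs

def derive(relation, G, V):
    """Sketch of historical derivation over a list of tuples.

    Each tuple is a list of (value, time-stamp) pairs. G is a predicate and
    V a list of functions (one per attribute); each takes one interval per attribute.
    """
    result = []
    for tup in relation:
        new_stamps = [set() for _ in tup]
        # Selection: try every assignment of one interval per attribute.
        for choice in product(*(intervals(stamp) for _, stamp in tup)):
            if G(*choice):
                # Projection: accumulate the new time-stamp of each attribute.
                for a, v in enumerate(V):
                    new_stamps[a] |= v(*choice)
        # Drop tuples whose attributes all have an empty time-stamp.
        if any(new_stamps):
            result.append([(value, frozenset(s))
                           for (value, _), s in zip(tup, new_stamps)])
    return result

# S6 as reconstructed from the examples in the text (an assumption).
S6 = [
    [("Phil", {1, 3, 4}), ("Kansas", {1, 2, 3})],
    [("Phil", {1, 3, 4}), ("Virginia", {4, 5, 6})],
    [("Norman", {1, 2, 5, 6}), ("Virginia", {1, 2, 5, 6})],
    [("Norman", {1, 2, 5, 6}), ("Texas", {7, 8})],
]

def overlap(name, state):
    """V function: the part of the enrollment interval spent in the state."""
    return name & state

def changed(name, state):
    """G predicate: the state covers part, but not all, of the enrollment interval."""
    common = name & state
    return common != name and len(common) > 0

home_state_changed = derive(S6, changed, [overlap, overlap])
```

Under these assumptions `home_state_changed` holds the two Phil tuples with time-stamps {3} and {4}, matching the second example above; both Norman tuples are dropped because no assignment of intervals satisfies the predicate.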


The derivation operator is necessarily complex because we allow set-valued time-stamps; it
would have been less complex if we had disallowed set-valued time-stamps. Then the derivation
operator could have been replaced by two simpler operators, analogous to the selection and
projection operators, that would have performed tuple selection and attribute projection in terms of the
valid components, rather than the value components, of attributes. But, as we will see in Section 4,
disallowing set-valued time-stamps would have required that the algebra support value-equivalent
tuples, which would have prevented the algebra from having several other, more highly desirable
properties.


2.3  Aggregates 

Aggregates allow users to summarize information contained in a relation. Aggregates are
categorized as either scalar aggregates or aggregate functions. Scalar aggregates return a single scalar
value that is the result of applying the aggregate to a specified attribute of a snapshot relation.
Aggregate functions, however, return a set of scalar values, each value the result of applying the
aggregate to a specified attribute of those tuples in a snapshot relation having the same values for
certain attributes. Database management systems based on the relational model typically provide
several aggregate operators. For example, Ingres [Stonebraker et al. 1976] provides a count, sum,
average, minimum, maximum, and any aggregate operator. Ingres also provides two versions of the
count, sum, and average operators, one that aggregates over all values of an attribute and one
that aggregates over only the unique values of an attribute.

Several researchers have investigated aggregates in time-oriented relational databases
[Ben-Zvi 1982, Jones et al. 1979, Navathe & Ahmed 1986, Snodgrass et al. 1987, Tansel et al. 1985].
Their work reflects the consensus that aggregates, when applied to historical relations, should return
not a scalar value, but a distribution of scalar values over time. Jones et al. also introduced the
concepts of instantaneous aggregates and cumulative aggregates. Instantaneous aggregates return,
for each time $t$, a value computed only from the tuples valid at time $t$. Cumulative aggregates
return, for each time $t$, a value computed from all tuples valid at any time up to and including
$t$, regardless of whether the tuples are still valid at time $t$. Note that a time $t$ has meaning only
when defined in terms of the time granularity. Hence, instantaneous aggregates can be viewed as
aggregates over an interval whose duration is determined by the granularity of the measure of time
being used. Others have generalized the definition of instantaneous and cumulative aggregates
by introducing the concept of moving aggregation windows [Navathe & Ahmed 1986]. For an
aggregation window function $w$ from the domain $T$ into the non-negative integers, an aggregate
returns, for each time $t$, a value computed from tuples valid either at time $t$ or at some time in
the interval of length $w(t)$ immediately preceding time $t$. Hence, an instantaneous aggregate is
an aggregate with an aggregation window function $w(t) = 0$ and a cumulative aggregate is an
aggregate with an aggregation window function $w(t) = \infty$.
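As a rough illustration of how one window function covers all three kinds of aggregate, the following Python sketch lists the times an aggregate evaluated at time $t$ may draw on. It assumes integer times and a time domain that starts at 1; the function `window` and the three sample window functions are hypothetical names chosen for this sketch.

```python
import math

def window(t, w):
    """Times considered by an aggregate evaluated at time t under window function w.

    Assumes integer times and a time domain whose first element is 1.
    """
    start = max(1, t - w(t))   # clamp so the window never leaves the domain
    return list(range(start, t + 1))

def instantaneous(t):
    return 0                   # w(t) = 0: only time t itself

def cumulative(t):
    return math.inf            # w(t) = infinity: everything up to and including t

def one_back(t):
    return 1                   # hypothetical moving window of length 1

at_five = {
    "instantaneous": window(5, instantaneous),
    "moving": window(5, one_back),
    "cumulative": window(5, cumulative),
}
```

Here `at_five` maps the three cases to `[5]`, `[4, 5]`, and `[1, 2, 3, 4, 5]` respectively.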

Klug introduced an approach to handle aggregates in the snapshot algebra [Klug 1982]. His
approach makes it possible to define aggregates in a rigorous way. We use his approach to define
two historical aggregate functions for our algebra:

•  $A$, that calculates non-unique aggregates, and

•  $AU$, that calculates unique aggregates.

These two historical aggregate functions serve as the historical counterpart of both scalar aggregates
and aggregate functions.

The  historical  aggregate  functions  must  contend  with  a  variety  of  demands  that  surface  as 
parameters  (subscripts)  to  the  functions.  First,  a  specific  aggregate  (e.g.,  count)  must  be  specified. 
Secondly,  the  attribute  over  which  the  aggregate  is  to  be  applied  must  be  stated  and  the  aggregation 
window  function  must  be  indicated.  Finally,  to  accommodate  partitioning,  where  the  aggregate  is 
applied  to  partitions  of  a  relation,  a  set  of  partitioning  attributes  must  be  given.  These  demands 
complicate  the  definitions  of  A  and  AU,  but  at  the  same  time  ensure  some  degree  of  generality  to 
these  operators. 

For both definitions, let $R$ be an historical relation of $m$-tuples over the relation scheme
$\mathcal{N}_R = \{N_1, \ldots, N_m\}$. Also let $a, c_1, \ldots, c_n$ be distinct integers in the range 1 to $m$ and $Q$ be an
historical relation over the relation scheme $\mathcal{N}_Q$, with the restrictions that $\mathcal{N}_Q \subseteq \mathcal{N}_R$ and
$\{N_a, N_{c_1}, \ldots, N_{c_n}\} \subseteq \mathcal{N}_Q$. Finally, let $X = \{N_{c_1}, \ldots, N_{c_n}\}$. If $X$ is empty, our historical aggregate functions
simply calculate a single distribution of scalar values over time for an arbitrary aggregate applied
to attribute $N_a$ of relation $R$. If $X$ is not empty, our historical aggregate functions calculate, for
each subtuple in $Q$ formed from the attributes $X$, a distribution of scalar values over time for an
arbitrary aggregate applied to attribute $N_a$ of the subset of tuples in $R$ whose values for attributes
$X$ match the values for attributes $X$ of the tuple in $Q$. Hence, $X$ corresponds to the by-list of an
aggregate function in conventional database query languages. Assume, as does Klug, that for each
aggregate operation (e.g., count) we have a family of scalar aggregates that performs the indicated
aggregation on $R$ (e.g., $\text{COUNT}_{N_1}, \text{COUNT}_{N_2}, \ldots, \text{COUNT}_{N_m}$, where $\text{COUNT}_{N_a}$, $1 \le a \le m$, counts the
(possibly duplicate) values of attribute $N_a$ of $R$). We will define our historical aggregate functions
in terms of these scalar aggregates.


2.3.1  Partitioning  Function 

Before defining the historical aggregate functions $A$ and $AU$, we define a partitioning function that
will be used in their definitions.


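$$
\begin{array}{l}
\text{PARTITION}(R, q, t, w, N_a, X) = \\
\quad \{\, u^m \mid (\exists r)\,(\, r \in R \;\wedge\; \forall l,\ 1 \le l \le n,\ \mathit{value}(r_{c_l}) = \mathit{value}(q_{c_l}) \\
\qquad \wedge\ \forall d,\ 1 \le d \le m,\ \mathit{value}(u_d) = \mathit{value}(r_d) \\
\qquad \wedge\ \forall d,\ 1 \le d \le m, \\
\qquad\quad (\, (\forall t',\ t' \in \mathit{valid}(u_d), \\
\qquad\qquad\quad \exists I_d,\ (I_d \in \text{INTERVAL}(\mathit{valid}(r_d)) \\
\qquad\qquad\qquad \wedge\ t - w(t) < 1 \rightarrow (I_d \cap \text{EXTEND}(1, t) \neq \emptyset) \\
\qquad\qquad\qquad \wedge\ t - w(t) \ge 1 \rightarrow (I_d \cap \text{EXTEND}(t - w(t), t) \neq \emptyset) \\
\qquad\qquad\qquad \wedge\ t' \in I_d \\
\qquad\qquad\quad ) \\
\qquad\qquad ) \\
\qquad\qquad \wedge\ (\forall I_d,\ (I_d \in \text{INTERVAL}(\mathit{valid}(r_d)) \\
\qquad\qquad\qquad \wedge\ t - w(t) < 1 \rightarrow (I_d \cap \text{EXTEND}(1, t) \neq \emptyset) \\
\qquad\qquad\qquad \wedge\ t - w(t) \ge 1 \rightarrow (I_d \cap \text{EXTEND}(t - w(t), t) \neq \emptyset)) \\
\qquad\qquad\quad \rightarrow I_d \subseteq \mathit{valid}(u_d) \\
\qquad\qquad )) \\
\qquad \wedge\ \mathit{valid}(u_a) \neq \emptyset \\
\qquad \wedge\ \forall l,\ 1 \le l \le n,\ \mathit{valid}(u_{c_l}) \neq \emptyset \\
\quad )\}
\end{array}
$$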

where $q \in Q$, $t \in T$, $w$ is an aggregation window function, and $1 \le a \le m$. This function retrieves
from $R$ those tuples that have the same value component for attribute $N_{c_l}$, $1 \le l \le n$, as $q$ and
have time $t$ or some time in the interval of length $w(t)$ immediately preceding $t$ in the time-stamp
of attributes $N_a, N_{c_1}, \ldots,$ and $N_{c_n}$. Note that the time-stamp of attribute $N_d$, $1 \le d \le m$, in the
resulting relation is constructed from those intervals in the time-stamp of attribute $N_d$ in $R$ that
contain time $t$ or some time in the interval of length $w(t)$ immediately preceding $t$. The predicates
$t - w(t) < 1 \rightarrow \cdots$ and $t - w(t) \ge 1 \rightarrow \cdots$ are used here to ensure that PARTITION is well-defined,
as EXTEND is defined only for elements in the domain $T$.

EXAMPLES. 

$$
\text{PARTITION}(S_6, (\,), 5, 0, \mathit{Name}, \emptyset) = \{\, ((\text{Norman}, \{5,6\}), (\text{Virginia}, \{5,6\})),\ ((\text{Norman}, \{5,6\}), (\text{Texas}, \emptyset)) \,\}
$$

Because time 5 is specified and the aggregation window function, denoted by zero, is the constant
function $w(t) = 0$, tuples are selected whose time-stamp for attribute Name overlaps time 5.
Only the third and fourth tuples in $S_6$ satisfy this requirement. The partitioning function here
effectively returns the tuples for those students who were enrolled in school at time 5. Note that
the time-stamp of each attribute in the selected tuples has been restricted to the interval from the
attribute's original time-stamp overlapping time 5, if any.

$$
\text{PARTITION}(S_6, ((\text{Phil}, \{1,3,4\}), (\text{Virginia}, \{4,5,6\})), 5, 0, \mathit{Name}, \{\mathit{State}\}) =
\{\, ((\text{Norman}, \{5,6\}), (\text{Virginia}, \{5,6\})) \,\}
$$

where $Q$ is here assumed to be $S_6$. Tuples are selected for those students who were enrolled in
school and a resident of Phil's state (Virginia) at time 5. Only the third tuple in $S_6$ satisfies this
requirement. Although Phil was a resident of Virginia at time 5, he was not enrolled in school at
time 5. Hence, the second tuple in $S_6$ is not included in this partition.

$$
\text{PARTITION}(S_6, ((\text{Phil}, \{1,3,4\}), (\text{Virginia}, \{4,5,6\})), 5, 1, \mathit{Name}, \{\mathit{State}\}) =
\{\, ((\text{Phil}, \{3,4\}), (\text{Virginia}, \{4,5,6\})),\ ((\text{Norman}, \{5,6\}), (\text{Virginia}, \{5,6\})) \,\}
$$

Here tuples are selected for those students who were enrolled in school and a resident of Virginia
within a year ($w(t) = 1$) of time 5. Both the second and third tuples in $S_6$ satisfy this requirement.
The second tuple in $S_6$ is now included in the partition because Phil was a resident of Virginia and
enrolled in school at time 4.  □
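The three partitions above can be reproduced with a short Python sketch of the partitioning function. It is an approximation written under stated assumptions: integer times, a time domain starting at 1, intervals taken as maximal runs of consecutive times, and a copy of $S_6$ reconstructed from the examples in this section. The dictionary representation and the names `intervals` and `partition` are hypothetical.

```python
def intervals(stamp):
    """Split a set of integer times into maximal runs of consecutive times."""
    runs, run = [], []
    for t in sorted(stamp):
        if run and t != run[-1] + 1:
            runs.append(frozenset(run))
            run = []
        run.append(t)
    if run:
        runs.append(frozenset(run))
    return runs

def partition(R, q, t, w, a, X):
    """Sketch of PARTITION; tuples map attribute name -> (value, time-stamp)."""
    window = set(range(max(1, t - w(t)), t + 1))
    result = []
    for r in R:
        # Keep only tuples that agree with q on the partitioning attributes X.
        if any(r[c][0] != q[c][0] for c in X):
            continue
        u = {}
        for name, (value, stamp) in r.items():
            # Restrict each time-stamp to its intervals that overlap the window.
            kept = [i for i in intervals(stamp) if i & window]
            u[name] = (value, frozenset().union(*kept))
        # The aggregated attribute and every attribute in X must stay non-empty.
        if u[a][1] and all(u[c][1] for c in X):
            result.append(u)
    return result

# S6 as reconstructed from the examples in the text (an assumption).
S6 = [
    {"Name": ("Phil", {1, 3, 4}), "State": ("Kansas", {1, 2, 3})},
    {"Name": ("Phil", {1, 3, 4}), "State": ("Virginia", {4, 5, 6})},
    {"Name": ("Norman", {1, 2, 5, 6}), "State": ("Virginia", {1, 2, 5, 6})},
    {"Name": ("Norman", {1, 2, 5, 6}), "State": ("Texas", {7, 8})},
]

enrolled_at_5 = partition(S6, {}, 5, lambda t: 0, "Name", [])
virginia_at_5 = partition(S6, S6[1], 5, lambda t: 0, "Name", ["State"])
virginia_near_5 = partition(S6, S6[1], 5, lambda t: 1, "Name", ["State"])
```

With a window of length 1, the Phil tuple enters `virginia_near_5` because its enrollment interval {3,4} overlaps the window {4,5}; with a window of length 0 it does not.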

2.3.2  Non-unique  Aggregates 

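The historical aggregate function $A$ calculates, for each tuple in $Q$, a distribution of scalar values
over time for an arbitrary aggregate applied to attribute $N_a$ of the subset of tuples in $R$ whose
value component for attribute $N_{c_l}$, $1 \le l \le n$, matches the value component for attribute $N_{c_l}$ of
the tuple in $Q$. If $X$ is empty, $A$ simply calculates a single distribution of scalar values over time
for the aggregate applied to attribute $N_a$ of $R$. If we let $f$ represent an arbitrary family of scalar
aggregates and $w$ represent an aggregation window function, then we can define $A$ on the historical
relations $Q$ and $R$, denoted by $A_{f,\,w,\,N_a,\,X}(Q, R)$, as

$$
\begin{array}{l}
A_{f,\,w,\,N_a,\,X}(Q, R) = \\
\quad \widehat{\bigcup}_{\forall t,\ t \in T} \Big( \hat{\pi}_{X \cup \{N_{agg}\}} \big( \{\, q \,\|\, (y, \{t\}) \mid q \in Q \\
\qquad \wedge\ t - w(t) < 1 \rightarrow (\, \mathit{valid}(q_a) \cap \text{EXTEND}(1, t) \neq \emptyset \\
\qquad\qquad \wedge\ \forall l,\ 1 \le l \le n,\ \mathit{valid}(q_{c_l}) \cap \text{EXTEND}(1, t) \neq \emptyset \,) \\
\qquad \wedge\ t - w(t) \ge 1 \rightarrow (\, \mathit{valid}(q_a) \cap \text{EXTEND}(t - w(t), t) \neq \emptyset \\
\qquad\qquad \wedge\ \forall l,\ 1 \le l \le n,\ \mathit{valid}(q_{c_l}) \cap \text{EXTEND}(t - w(t), t) \neq \emptyset \,) \\
\qquad \wedge\ y = f_{N_a}(q, t, \text{PARTITION}(R, q, t, w, N_a, X)) \\
\quad \} \big) \Big)
\end{array}
$$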

where "$\|$" denotes concatenation and $N_{agg}$ is the attribute name assigned the aggregate value
$(y, \{t\})$. If $X$ is not empty, function $A$ first associates with each time $t$ the partition of relation $Q$
whose tuples have $t$, or a time in the interval of length $w(t)$ immediately preceding $t$, in the valid
component of attributes $N_a, N_{c_1}, \ldots,$ and $N_{c_n}$. For each of these partitions, $A$ then constructs a set
of historical tuples. Each tuple in the set contains all the attributes $X$ of a tuple $q$ in the partition
and a new attribute. This new attribute's valid component is the time $t$ corresponding to the
partition and its value component is the scalar value returned by the aggregate $f_{N_a}$, when $f_{N_a}$ is
applied to the partition of $R$ whose tuples have value components that match $q$'s value components
for attributes $X$ and whose valid components for attributes $N_a, N_{c_1}, \ldots,$ and $N_{c_n}$ overlap either $t$
or the interval of length $w(t)$ immediately preceding $t$. Then $A$ performs an historical union of the
resulting sets of historical tuples to produce a distribution of aggregate values over time for each
tuple in $Q$. If $X$ is empty, $A$ constructs for each time $t$ an historical relation that is either empty or
contains a single tuple. If the valid component of attribute $N_a$ of no tuple $r$ in $R$ overlaps $t$ or the
interval of length $w(t)$ immediately preceding $t$, then the historical relation is empty. Otherwise,
the historical relation contains a single tuple whose valid component is the time $t$ and whose value
component is the scalar value returned by the aggregate $f_{N_a}$, when $f_{N_a}$ is applied to the partition
of $R$ whose tuples have a valid component for attribute $N_a$ that overlaps either $t$ or the interval of
length $w(t)$ immediately preceding $t$. Then $A$ performs an historical union of the resulting sets of
historical tuples to produce a single distribution of aggregate values over time.

Note that a tuple and a time are passed as parameters to the scalar aggregate $f_{N_a}$, along with
a partition of $R$, in the definition of $A$. Although most aggregate operators can be defined in terms
of a single parameter, the partition of $R$, the additional parameters are present because aggregates
that evaluate to events or intervals, one of which is defined in Section 3.3, require them.


EXAMPLES.

$$
A_{\text{COUNT},\,0,\,\mathit{State},\,\emptyset}(\hat{\pi}_{\mathit{State}}(S_6), S_6) = \{\, ((1, \{3,4,7,8\})),\ ((2, \{1,2,5,6\})) \,\}
$$

The function $A$ computes the number of states in which enrolled students resided. Because $w(t) = 0$
and the time granularity of $S_6$ is a semester, the resulting relation represents aggregation by
semester. Hence, the aggregate is in effect an instantaneous aggregate. For the interval $\{1,2\}$,
there were two states (Kansas in the first tuple and Virginia in the third tuple). For the interval
$\{3,4\}$, there was one state (Kansas in the first tuple at time 3 and Virginia in the second tuple at
time 4). For the interval $\{5,6\}$, there also was only one state (Virginia), but it appeared in both
the second and third tuples. It was counted twice because the scalar aggregates embedded within
$A$ aggregate over duplicate values. For the interval $\{7,8\}$, there was only one state (Texas in the
fourth tuple).

$$
A_{\text{COUNT},\,1,\,\mathit{State},\,\emptyset}(\hat{\pi}_{\mathit{State}}(S_6), S_6) = \{\, ((1, \{8,9\})),\ ((2, \{1,2,3,4,5,6\})),\ ((3, \{7\})) \,\}
$$

Again, $A$ computes the number of states in which enrolled students resided, but now $w(t) = 1$.
Hence, the resulting relation now represents aggregation by year (assuming two semesters per year).
Although nine does not appear in the time-stamp of attribute State in any tuple in $S_6$, a count of
one is recorded at time 9 because a tuple, the fourth tuple in $S_6$, falls into the aggregation window
at time 9.

$$
A_{\text{COUNT},\,\infty,\,\mathit{State},\,\emptyset}(\hat{\pi}_{\mathit{State}}(S_6), S_6) = \{\, ((2, \{1,2,3\})),\ ((3, \{4,5,6\})),\ ((4, \{7,8,\ldots\})) \,\}
$$

Now, with $w(t) = \infty$, $A$ computes a cumulative aggregate of the number of states in which enrolled
students resided.

$$
\begin{array}{l}
A_{\text{COUNT},\,0,\,\mathit{Name},\,\{\mathit{State}\}}(S_6, S_6) = \{\, ((\text{Kansas}, \{1,2,3\}), (1, \{1,2,3\})), \\
\qquad ((\text{Virginia}, \{1,2,4,5,6\}), (1, \{1,2,4\})), \\
\qquad ((\text{Virginia}, \{1,2,4,5,6\}), (2, \{5,6\})), \\
\qquad ((\text{Texas}, \{7,8\}), (1, \{7,8\})) \,\}
\end{array}
$$

Here, $A$ computes the instantaneous aggregate of the number of enrolled students who resided in
each state. In effect, the aggregate is computed for each subset of tuples in $S_6$ having the same
value for the attribute State. For example, the first tuple is computed by selecting all the tuples
in $S_6$ with a state of Kansas and then performing the aggregate on this (smaller) set.  □
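For the case in which $X$ is empty, the first three examples can be checked with a simplified Python sketch. It is not the general definition of $A$: it hard-codes the count aggregate, assumes $Q$ is the projection of $R$ on the aggregated attribute (so a time produces a tuple exactly when the partition is non-empty), assumes integer times starting at 1, and evaluates only a finite range of times. The relation contents are reconstructed from the examples in this section, and `count_distribution` is a hypothetical name.

```python
import math

# S6 as reconstructed from the examples in the text (an assumption).
S6 = [
    {"Name": ("Phil", {1, 3, 4}), "State": ("Kansas", {1, 2, 3})},
    {"Name": ("Phil", {1, 3, 4}), "State": ("Virginia", {4, 5, 6})},
    {"Name": ("Norman", {1, 2, 5, 6}), "State": ("Virginia", {1, 2, 5, 6})},
    {"Name": ("Norman", {1, 2, 5, 6}), "State": ("Texas", {7, 8})},
]

def count_distribution(R, a, w, times):
    """Non-unique count of attribute a over time, with no partitioning attributes.

    Returns a dict mapping each count to the set of times at which it holds.
    """
    dist = {}
    for t in times:
        window = set(range(max(1, t - w(t)), t + 1))
        # A tuple falls in the partition when attribute a's time-stamp
        # overlaps the aggregation window; duplicates are counted.
        n = sum(1 for r in R if r[a][1] & window)
        if n:
            dist.setdefault(n, set()).add(t)
    return dist

times = range(1, 11)   # finite stand-in for the time domain
by_semester = count_distribution(S6, "State", lambda t: 0, times)
by_year = count_distribution(S6, "State", lambda t: 1, times)
cumulative = count_distribution(S6, "State", lambda t: math.inf, times)
```

Over times 1 to 10, `by_semester` and `by_year` agree with the first two results above, and `cumulative` gives counts 2, 3, and 4 on {1,2,3}, {4,5,6}, and {7,...,10}.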


2.3.3  Unique  Aggregates 


The function $A$ allows its embedded scalar aggregates to aggregate over duplicate attribute values.
We now define an historical aggregate function $AU$, identical to $A$ with one exception: it restricts
its embedded scalar aggregates to aggregation over unique attribute values. We define $AU$ on the
historical relations $Q$ and $R$, denoted by $AU_{f,\,w,\,N_a,\,X}(Q, R)$, as

$$
\begin{array}{l}
AU_{f,\,w,\,N_a,\,X}(Q, R) = \\
\quad \widehat{\bigcup}_{\forall t,\ t \in T} \Big( \hat{\pi}_{X \cup \{N_{agg}\}} \big( \{\, q \,\|\, (y, \{t\}) \mid q \in Q \\
\qquad \wedge\ t - w(t) < 1 \rightarrow (\, \mathit{valid}(q_a) \cap \text{EXTEND}(1, t) \neq \emptyset \\
\qquad\qquad \wedge\ \forall l,\ 1 \le l \le n,\ \mathit{valid}(q_{c_l}) \cap \text{EXTEND}(1, t) \neq \emptyset \,) \\
\qquad \wedge\ t - w(t) \ge 1 \rightarrow (\, \mathit{valid}(q_a) \cap \text{EXTEND}(t - w(t), t) \neq \emptyset \\
\qquad\qquad \wedge\ \forall l,\ 1 \le l \le n,\ \mathit{valid}(q_{c_l}) \cap \text{EXTEND}(t - w(t), t) \neq \emptyset \,) \\
\qquad \wedge\ y = f_{N_a}(q, t, \delta_{\mathit{true},\ \text{[illegible]}}(\hat{\pi}_{N_a}(\text{PARTITION}(R, q, t, w, N_a, X)))) \\
\quad \} \big) \Big)
\end{array}
$$

This definition differs from that of $A$ only in that the historical projection on attribute $N_a$ of
PARTITION(...) followed by the historical derivation eliminates duplicate values of the
aggregated attribute before the scalar aggregation is performed.

EXAMPLE.

$$
AU_{\text{COUNT},\,0,\,\mathit{State},\,\emptyset}(\hat{\pi}_{\mathit{State}}(S_6), S_6) = \{\, ((1, \{3,4,5,6,7,8\})),\ ((2, \{1,2\})) \,\}
$$

This relation differs from the non-unique variant only during the interval $\{5,6\}$. Here, Virginia
is correctly counted only once, even though there are two tuples valid during this interval with a
state of Virginia.  □
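The difference from the non-unique count can be seen in a small Python sketch that counts distinct values instead of tuples. As before, this is only an approximation under stated assumptions: no partitioning attributes, the count aggregate only, integer times starting at 1, a finite range of times, and a copy of $S_6$ reconstructed from the examples in this section. The name `unique_count_distribution` is hypothetical.

```python
# S6 as reconstructed from the examples in the text (an assumption).
S6 = [
    {"Name": ("Phil", {1, 3, 4}), "State": ("Kansas", {1, 2, 3})},
    {"Name": ("Phil", {1, 3, 4}), "State": ("Virginia", {4, 5, 6})},
    {"Name": ("Norman", {1, 2, 5, 6}), "State": ("Virginia", {1, 2, 5, 6})},
    {"Name": ("Norman", {1, 2, 5, 6}), "State": ("Texas", {7, 8})},
]

def unique_count_distribution(R, a, w, times):
    """Unique count of attribute a over time, with no partitioning attributes."""
    dist = {}
    for t in times:
        window = set(range(max(1, t - w(t)), t + 1))
        # Collect distinct values of attribute a among tuples in the window,
        # so that a value appearing in several tuples is counted once.
        values = {r[a][0] for r in R if r[a][1] & window}
        if values:
            dist.setdefault(len(values), set()).add(t)
    return dist

unique_by_semester = unique_count_distribution(S6, "State", lambda t: 0, range(1, 11))
```

At times 5 and 6 both Virginia tuples fall in the window, but the set of distinct values has size 1, so `unique_by_semester` records a count of 1 there.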

2.3.4  Expressions  in  Aggregates 

The functions $A$ and $AU$ allow expressions to be aggregated and support aggregation by arbitrary
expressions. Let $E_{aggregate}$ be an arbitrary expression involving $u$ historical aggregate functions.
Also, assume that the $v$th historical aggregate function, $1 \le v \le u$, applies the scalar aggregate $f_v$ to attribute
$N_{a_v}$, where the aggregation window function is $w_v$, and the partitioning attributes are $X_v$. Then
the definition of $A$, now denoted by

$$
A_{f_1, \ldots, f_u,\ w_1, \ldots, w_u,\ N_{a_1}, \ldots, N_{a_u},\ X_1, \ldots, X_u,\ E_{aggregate}}(Q, R),
$$

is constructed from the definition of $A$ above simply by substituting $y = E_{aggregate}'$ for
$y = f_{N_a}(q, t, \text{PARTITION}(R, q, t, w, N_a, X))$. $E_{aggregate}'$ is $E_{aggregate}$ where each reference to the $v$th aggregate has been replaced
by the expression $f_{v\,N_{a_v}}(q, t, \text{PARTITION}(R, q, t, w_v, N_{a_v}, X_v))$. With these changes, $A$
allows expressions to be aggregated. $AU$ can be modified similarly.

If $A$ and $AU$ are to support aggregation by arbitrary expressions, changes must be made to
the definitions of PARTITION, $A$, and $AU$ given above. First, let $E_{value_l}$, $1 \le l \le o$, be an
expression involving the attribute names $N_{c_1}, \ldots, N_{c_n}$. $E_{value_l}$ is evaluated for a tuple $r$, $r \in R$,
by substituting the value components of the attributes of $r$ for all occurrences of their corresponding
attribute names in $E_{value_l}$. Secondly, let $X = \{E_{value_1}, \ldots, E_{value_o}\}$ and $d_1, \ldots, d_p$ be
the distinct integers in the range 1 to $m$ such that $N_{d_h}$, $1 \le h \le p$, appears in at least one
$E_{value_l}$, $1 \le l \le o$. Then new definitions of PARTITION, $A$, and $AU$ are constructed from the
definitions above simply by substituting the predicate $\forall l,\ 1 \le l \le o,\ E_{value_l}(r) = E_{value_l}(q)$ for
the predicate $\forall l,\ 1 \le l \le n,\ \mathit{value}(r_{c_l}) = \mathit{value}(q_{c_l})$ and the predicate $\forall l,\ 1 \le l \le p,\ \mathit{valid}(u_{d_l}) \neq \emptyset$
for the predicate $\forall l,\ 1 \le l \le n,\ \mathit{valid}(u_{c_l}) \neq \emptyset$ in the definition of PARTITION and substituting
$p$ for $n$ and $\mathit{valid}(q_{d_l})$ for $\mathit{valid}(q_{c_l})$ in the definitions of $A$ and $AU$. With these changes, $A$ and $AU$
support aggregation by arbitrary expressions.


2.4  Preservation  of  the  Value-equivalence  Property 

Theorem 1  The operators $\hat{\cup}$, $\hat{-}$, $\hat{\times}$, $\hat{\sigma}$, $\hat{\pi}$, $\delta$, $A$, and $AU$ all preserve the value-equivalence property
of historical relations.

PROOF. For the operators $\hat{\cup}$, $\hat{-}$, $\hat{\times}$, $\hat{\sigma}$, and $\delta$ we show that the contrapositive of the theorem
holds, that is, if there are value-equivalent tuples in an operator's output relation, then there are
value-equivalent tuples in at least one of its input relations. For the operators $\hat{\pi}$, $A$, and $AU$, we
show by contradiction that there cannot be value-equivalent tuples in their output relations.

Case 1. $\hat{\cup}$. Assume that $Q \mathbin{\hat{\cup}} R$ contains at least two value-equivalent tuples. From the definition
of $\hat{\cup}$, each tuple in $Q \mathbin{\hat{\cup}} R$ has a value-equivalent tuple in $Q$, $R$, or both. If two value-equivalent
tuples $u_1$ and $u_2$ in $Q \mathbin{\hat{\cup}} R$ do not have a value-equivalent tuple in $R$, then both are tuples in $Q$.
Similarly, if they do not have a value-equivalent tuple in $Q$, then both are tuples in $R$. If they
have a value-equivalent tuple in both $Q$ and $R$, then each was constructed from a value-equivalent
tuple in $Q$ and a value-equivalent tuple in $R$. If both $u_1$ and $u_2$ had been constructed from the
same tuple in $Q$ and the same tuple in $R$, then $u_1$ and $u_2$ would be, by definition, the same tuple.
Hence, they were constructed from different value-equivalent tuples in $Q$, $R$, or both.

Case 2. $\hat{-}$. Assume that $Q \mathbin{\hat{-}} R$ contains at least two value-equivalent tuples. From the definition
of $\hat{-}$, each tuple in $Q \mathbin{\hat{-}} R$ has a value-equivalent tuple in $Q$ but not in $R$ or a value-equivalent tuple
in both $Q$ and $R$. If two value-equivalent tuples $u_1$ and $u_2$ in $Q \mathbin{\hat{-}} R$ do not have a value-equivalent
tuple in $R$, then both are tuples in $Q$. If they have a value-equivalent tuple in both $Q$ and $R$,
then each was constructed from a value-equivalent tuple in $Q$ and a value-equivalent tuple in $R$.
If both $u_1$ and $u_2$ had been constructed from the same tuple in $Q$ and the same tuple in $R$, then
$u_1$ and $u_2$ would be, by definition, the same tuple. Hence, they were constructed from different
value-equivalent tuples in $Q$, $R$, or both.




Case 3. $\hat{\times}$. Assume that $Q \mathbin{\hat{\times}} R$ contains at least two value-equivalent tuples. From the definition
of $\hat{\times}$, each tuple in $Q \mathbin{\hat{\times}} R$ is constructed from a tuple in $Q$ and a tuple in $R$. If two value-equivalent
tuples $u_1$ and $u_2$ in $Q \mathbin{\hat{\times}} R$ had been constructed from the same tuple in $Q$ and the same tuple in
$R$, then $u_1$ and $u_2$ would be, by definition, the same tuple. Hence, they were constructed from
different value-equivalent tuples in $Q$, $R$, or both.


Case 4. $\hat{\sigma}$. Assume that $\hat{\sigma}_F(R)$ contains at least two value-equivalent tuples. From the definition
of $\hat{\sigma}$, each tuple in $\hat{\sigma}_F(R)$ is a tuple in $R$. Hence, any two value-equivalent tuples in $\hat{\sigma}_F(R)$ are also
tuples in $R$.

Case 5. $\hat{\pi}$. Assume that $\hat{\pi}_{N_{a_1}, \ldots, N_{a_n}}(R)$ contains at least two value-equivalent tuples. For any
two such tuples there will be at least one time that appears in the time-stamp of an attribute
of one tuple but not the other tuple; otherwise, they would be identical. Hence, let $u_1$ and $u_2$
be two value-equivalent tuples in $\hat{\pi}_{N_{a_1}, \ldots, N_{a_n}}(R)$ such that there is a time $t$ in the time-stamp of
attribute $N_{a_l}$, $1 \le l \le n$, of $u_1$ but not $u_2$. From the first clause of the definition of $\hat{\pi}$, there is
a tuple $r$, $r \in R$, that has $t$ in the time-stamp of attribute $N_{a_l}$ and the same value for attributes
$N_{a_1}, \ldots, N_{a_n}$ as $u_1$. But, from the second clause of the definition, the time-stamp of attribute
$N_{a_l}$ of tuple $r$ is a subset of the time-stamp of attribute $N_{a_l}$ of $u_2$, as $r$ also has the same value for
attributes $N_{a_1}, \ldots, N_{a_n}$ as $u_2$. Hence, $t$ is in the time-stamp of attribute $N_{a_l}$ of $u_2$, contradicting
the assumption that $t$ is in the time-stamp of attribute $N_{a_l}$ of $u_1$ but not $u_2$. Similarly, we arrive at
a contradiction if we assume that there is a time $t$ in the time-stamp of attribute $N_{a_l}$, $1 \le l \le n$,
of $u_2$ but not $u_1$. Hence, $u_1$ and $u_2$ have identical attribute time-stamps, which implies that they
are the same tuple, contradicting the assumption that $\hat{\pi}_{N_{a_1}, \ldots, N_{a_n}}(R)$ contains at least two
value-equivalent tuples. Note that the output relation of $\hat{\pi}$, unlike the output relations of $\hat{\cup}$, $\hat{-}$, $\hat{\times}$, and
$\hat{\sigma}$, would not contain value-equivalent tuples even if there were value-equivalent tuples in its input
relation.


Case 6. $\delta$. Assume that $\delta_{G, V_1, \ldots, V_m}(R)$ contains at least two value-equivalent tuples, $u_1$ and $u_2$.
From the definition of $\delta$, each tuple in $\delta_{G, V_1, \ldots, V_m}(R)$ is constructed from one value-equivalent
tuple in $R$. If $u_1$ and $u_2$ were constructed from the same value-equivalent tuple $r$, $r \in R$, then they
would be the same tuple, as $\delta$ requires not only that every time $t$ in the time-stamp of attribute
$N_a$, $1 \le a \le m$, of either $u_1$ or $u_2$ be in $V_a(\cdots)$ and satisfy $G(\cdots)$ for some assignment of intervals
from the time-stamps of $r$'s attributes to attribute names but that $V_a(\cdots)$ be a subset of the
time-stamp of attribute $N_a$ of both $u_1$ and $u_2$. Hence, $u_1$ and $u_2$ were constructed from different
value-equivalent tuples in $R$.

Case 7. $A$. Assume that $A_{f,\,w,\,N_a,\,X}(Q, R)$ contains at least two value-equivalent tuples. From
Case 1 above, if $A_{f,\,w,\,N_a,\,X}(Q, R)$ contains value-equivalent tuples, then the input relation to $A$'s
outermost $\hat{\cup}$ operator contains value-equivalent tuples. But, this relation is the output of $\hat{\pi}$, whose
output relation was shown in Case 5 above never to contain value-equivalent tuples. Hence, our
assumption that $A_{f,\,w,\,N_a,\,X}(Q, R)$ contains at least two value-equivalent tuples is contradicted.

Case 8. $AU$. Simply replace $A$ with $AU$ in Case 7.  ■


2.5  Summary

We first introduced historical relations, in which attribute values are associated with set-valued
time-stamps. We then defined eight historical operators:

•  Five operators are analogous to the five standard snapshot operators: union ($\hat{\cup}$), difference
($\hat{-}$), cartesian product ($\hat{\times}$), selection ($\hat{\sigma}$), and projection ($\hat{\pi}$).

•  Historical derivation ($\delta$) effectively performs selection and projection on the valid-time
dimension by replacing the time-stamp of each attribute of selected tuples with a new time-stamp.

•  Aggregation ($A$) and unique aggregation ($AU$) serve to compute a distribution of single values
over time for a collection of tuples.

We should mention several other operators that can exist harmoniously with these eight
operators. Intersection ($\hat{\cap}$), quotient ($\hat{\div}$), natural join ($\hat{\bowtie}$), and $\Theta$-join ($\hat{\bowtie}_{\Theta}$) can all be defined in
terms of the five basic operators in an identical fashion to the definition of their snapshot
counterparts. Finally, the historical rollback operator ($\hat{\rho}$), defined elsewhere [McKenzie & Snodgrass
1987A], serves to generalize the algebra to handle temporal relations incorporating both valid and
transaction time.
tuple became valid (i.e., From) and the time when the tuple became invalid (i.e., To). Also unlike
our historical algebra, TQuel allows value-equivalent tuples in a relation but assumes that
value-equivalent tuples are coalesced (i.e., tuples with identical values for the explicit attributes neither
overlap nor are adjacent in time). As we will see shortly, it is possible to convert the embedded,
coalesced snapshot relations used in TQuel's formal semantics to historical relations.


3.1  TQuel  Retrieve  Statement 

Assume that we are given the $k$ snapshot relations $R'_1, \ldots, R'_k$ whose schemes are respectively

$$
\begin{array}{l}
\mathcal{N}_1 = \{N_{1,1}, \ldots, N_{1,m_1}, \mathit{From}_1, \mathit{To}_1\} \\
\qquad \vdots \\
\mathcal{N}_k = \{N_{k,1}, \ldots, N_{k,m_k}, \mathit{From}_k, \mathit{To}_k\}
\end{array}
$$

For notational convenience, we associate " $'$ " with TQuel relations, tuple variables, and
expressions to differentiate them from their counterparts in the historical algebra and assume that
$N_{1,1}, \ldots, N_{k,m_k}$ are unique. Furthermore, let $i_1, i_2, \ldots, i_n$ be integers, not necessarily distinct,
in the range 1 to $k$ and $a_l$, $1 \le l \le n$, be a distinct integer in the range 1 to $m_{i_l}$. Then, the TQuel
retrieve statement has the following syntax

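    range of $r'_1$ is $R'_1$
        ...
    range of $r'_k$ is $R'_k$
    retrieve into $R'_{k+1}$ ($N_{k+1,1} = r'_{i_1}.N_{i_1,a_1}, \ldots, N_{k+1,n} = r'_{i_n}.N_{i_n,a_n}$)        (1)
    valid from $\nu$ to $\chi$
    where $\psi$
    when $\tau$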

This statement computes a new relation $R'_{k+1}$ over the relational scheme

$$
\mathcal{N}_{k+1} = \{N_{k+1,1}, \ldots, N_{k+1,n}, \mathit{From}_{k+1}, \mathit{To}_{k+1}\}
$$


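Its tuple calculus statement has the following form

$$
\begin{array}{ll}
R'_{k+1} = \{\, u^{(n+2)} \mid (\exists r'_1) \cdots (\exists r'_k) & \\
\quad (\, r'_1 \in R'_1 \wedge \cdots \wedge r'_k \in R'_k & \\
\quad \wedge\ u(N_{k+1,1}) = r'_{i_1}(N_{i_1,a_1}) \wedge \cdots \wedge u(N_{k+1,n}) = r'_{i_n}(N_{i_n,a_n}) & \\
\quad \wedge\ u(\mathit{From}_{k+1}) = \Phi'_{\nu}((r'_1(\mathit{From}_1), r'_1(\mathit{To}_1)), \ldots, (r'_k(\mathit{From}_k), r'_k(\mathit{To}_k))) & \\
\quad \wedge\ u(\mathit{To}_{k+1}) = \Phi'_{\chi}((r'_1(\mathit{From}_1), r'_1(\mathit{To}_1)), \ldots, (r'_k(\mathit{From}_k), r'_k(\mathit{To}_k))) & (2) \\
\quad \wedge\ \mathit{Before}(u(\mathit{From}_{k+1}), u(\mathit{To}_{k+1})) & \\
\quad \wedge\ \psi' & \\
\quad \wedge\ \Gamma'_{\tau}((r'_1(\mathit{From}_1), r'_1(\mathit{To}_1)), \ldots, (r'_k(\mathit{From}_k), r'_k(\mathit{To}_k))) & \\
\quad )\} &
\end{array}
$$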

where $\mathit{Before}$ is the "$<$" predicate on integers, the ordered pair $(r'_i(\mathit{From}_i), r'_i(\mathit{To}_i))$, $1 \le i \le k$,
represents the interval $[r'_i(\mathit{From}_i), r'_i(\mathit{To}_i))$, and $\psi'$, $\Phi'_{\nu}$, $\Phi'_{\chi}$, and $\Gamma'_{\tau}$ are the denotations described
below of $\psi$, $\nu$, $\chi$, and $\tau$ respectively.

$\psi'$ is obtained by replacing each occurrence of an attribute reference $r'_i.N_{i,a}$, $1 \le i \le k$,
$1 \le a \le m_i$, in $\psi$ with $r'_i(N_{i,a})$ and each occurrence of a logical operator with its corresponding logical
predicate. That is,

    $r'_i.N_{i,a} \rightarrow r'_i(N_{i,a})$,
    and $\rightarrow \wedge$,
    or $\rightarrow \vee$, and
    not $\rightarrow \neg$.

$\Phi'_\nu$ and $\Phi'_\chi$ are obtained by replacing each occurrence of a tuple variable $r'_i$ in ν and χ with the ordered pair $(r'_i(From_i), r'_i(To_i))$ and each occurrence of a temporal constructor with a corresponding function. That is,

    r'_i              →  (r'_i(From_i), r'_i(To_i)),
    begin of I        →  beginof(I),
    end of I          →  endof(I),
    I_1 overlap I_2   →  overlap(I_1, I_2), and
    I_1 extend I_2    →  extend(I_1, I_2)

where beginof, endof, overlap, and extend are functions on the domain I. Formal definitions for these functions are presented elsewhere [Snodgrass 1987].

$\Gamma'_\tau$ is obtained by replacing each occurrence of a logical operator in τ with its corresponding logical predicate according to the rules given for its replacement in ψ, replacing each occurrence of a tuple variable or temporal constructor according to the rules given for their replacement in ν and χ, and replacing each occurrence of a temporal predicate operator with an analogous predicate on intervals. That is,

    I_1 precede I_2   →  precede(I_1, I_2),
    I_1 overlap I_2   →  overlap(I_1, I_2), and
    I_1 equal I_2     →  equal(I_1, I_2)

where precede, overlap, and equal are predicates on the domain I. Formal definitions for these predicates are presented elsewhere [Snodgrass 1987].

3.2  Correspondence with the Historical Algebra

To compare the expressive power of TQuel and the historical algebra presented in Section 2, we first relate TQuel relations to historical relations, then the expressions in the new TQuel clauses, and finally the retrieve statement itself.

Definition 1  T transforms an embedded, coalesced snapshot relation $R'$ over the scheme $\mathcal{N}' = \{N_1, \ldots, N_m, From, To\}$ into its equivalent historical relation $T(R')$, valid in our historical algebra.

    [The formal two-clause definition of $T(R')$ is illegible in the source.]

The first clause of the definition ensures that each tuple in $T(R')$ has at least one value-equivalent tuple in $R'$. The second clause in the definition ensures that each subset of value-equivalent tuples in $R'$ is represented by a single tuple in $T(R')$. Note also that the same time-stamp is assigned to each attribute of a tuple in $T(R')$. This time-stamp is simply the union of the time-stamps of the tuple's value-equivalent tuples in $R'$. Because TQuel assumes that value-equivalent tuples are coalesced, the time-stamp of each tuple in $R'$ is a distinguishable interval of time in the attribute time-stamps of its value-equivalent counterpart in $T(R')$, as shown by the following lemma.


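Lemma 1

\[
\begin{aligned}
& \forall r,\ r \in T(R'),\ \forall a,\ 1 \le a \le m,\ \forall I,\ I \in \mathrm{INTERVAL}(\mathit{valid}(r(N_a))), \\
& \quad \exists r',\ (r' \in R' \\
& \qquad \wedge\ \forall c,\ 1 \le c \le m,\ \mathit{value}(r(N_c)) = r'(N_c) \\
& \qquad \wedge\ I = \mathrm{EXTEND}(r'(From),\ \mathrm{SUCC}(r'(To))) \\
& \quad )
\end{aligned}
\]

PROOF. Apply the definitions of coalescing and INTERVAL to T and simplify. ■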
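
The effect of T described above (one historical tuple per set of value-equivalent snapshot tuples, with the union of their time-stamps assigned to every attribute) can be sketched as follows. This is only an illustration under stated assumptions, not the paper's formal definition: time is modeled as integer chronons, a snapshot row is taken to be valid at the chronons From, From+1, ..., To-1, and the names `to_historical` and `rows` are hypothetical.

```python
from collections import defaultdict


def to_historical(snapshot_rows):
    """Group value-equivalent rows and union their time-stamps.

    Each input row is (values, frm, to), where `values` is a tuple of the
    explicit attribute values.
    ASSUMPTION: a row is valid at the integer chronons frm .. to-1.
    """
    stamps = defaultdict(set)
    for values, frm, to in snapshot_rows:
        stamps[values].update(range(frm, to))
    # One historical tuple per distinct value combination; every attribute
    # of that tuple is a (value, time-stamp) pair with the same time-stamp.
    return {
        values: tuple((v, frozenset(ts)) for v in values)
        for values, ts in stamps.items()
    }


rows = [
    (("Ann", "CS"), 1, 3),
    (("Ann", "CS"), 5, 7),   # value-equivalent to the first row
    (("Bob", "EE"), 2, 4),
]
historical = to_historical(rows)
```

Each of the two intervals contributed by the value-equivalent rows for ("Ann", "CS") remains a separate run of chronons inside the merged time-stamp, which is the property Lemma 1 relies on.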

Definition 2  We define a m+2-tuple TQuel relation R' and a m-tuple relation R in our historical
algebra to be equivalent if, and only if, R = T(R'). In addition, we define a TQuel query and an
expression  in  our  historical  algebra  to  be  equivalent  if,  and  only  if,  they  evaluate  to  equivalent 
relations. 

Let $\Psi_\psi$, $\Phi_\nu$, and $\Phi_\chi$ be the denotations in our algebra of ψ, ν, and χ respectively. $\Psi_\psi$ is obtained by replacing each occurrence of $r'_i(N_{i,a})$, $1 \le i \le k$, $1 \le a \le m_i$, in $\Psi'_\psi$ with $N_{i,a}$. $\Phi_\nu$ and $\Phi_\chi$ are obtained by replacing each occurrence of an ordered pair $(r'_i(From_i), r'_i(To_i))$, $1 \le i \le k$, in $\Phi'_\nu$ and $\Phi'_\chi$ with $N_{i,1}$ and each occurrence of a TQuel function with its algebraic equivalent. That is,

    (r'_i(From_i), r'_i(To_i))  →  N_{i,1},
    beginof(I)                  →  FIRST(I),
    endof(I)                    →  LAST(I),
    overlap(I_1, I_2)           →  I_1 ∩ I_2, and
    extend(I_1, I_2)            →  EXTEND(FIRST(I_1), LAST(I_2)).

Also let $\Gamma_\tau$ be the denotation in our algebra of τ. $\Gamma_\tau$ is obtained by replacing each occurrence of an ordered pair $(r'_i(From_i), r'_i(To_i))$ and each occurrence of a TQuel function in $\Gamma'_\tau$ with its algebraic equivalent according to the rules above and each occurrence of the predicates precede, overlap, and equal with its algebraic equivalent. That is,

    precede(I_1, I_2)  →  LAST(I_1) < FIRST(I_2) ∨ LAST(I_1) = FIRST(I_2),
    overlap(I_1, I_2)  →  I_1 ∩ I_2 ≠ ∅, and
    equal(I_1, I_2)    →  I_1 = I_2.
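
The replacement rules above map TQuel's interval functions and predicates onto set operations over time-stamps. A minimal sketch of that mapping follows, assuming that an interval is modeled as a non-empty frozenset of consecutive integer chronons, that EXTEND(t1, t2) includes both endpoints (the formal definition is in Appendix B and is not reproduced here), and that the overlap predicate means non-empty intersection. All Python names are illustrative.

```python
def FIRST(interval):
    """Earliest chronon of a non-empty interval."""
    return min(interval)


def LAST(interval):
    """Latest chronon of a non-empty interval."""
    return max(interval)


def EXTEND(t1, t2):
    """ASSUMPTION: all chronons from t1 through t2, both inclusive."""
    return frozenset(range(t1, t2 + 1))


# Algebraic equivalents of the TQuel temporal constructors.
def overlap_fn(i1, i2):
    return i1 & i2


def extend_fn(i1, i2):
    return EXTEND(FIRST(i1), LAST(i2))


# Algebraic equivalents of the TQuel temporal predicates.
def precede(i1, i2):
    return LAST(i1) < FIRST(i2) or LAST(i1) == FIRST(i2)


def overlap_pred(i1, i2):
    return len(i1 & i2) > 0   # assumed reading: non-empty intersection


def equal(i1, i2):
    return i1 == i2


a = frozenset({1, 2, 3})
b = frozenset({3, 4})
```

With these definitions, `precede(a, b)` holds because the last chronon of `a` equals the first chronon of `b`, which is the second disjunct of the rule for precede.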

Note from the definition of $T(R')$ that a tuple in $T(R')$ has the same time-stamp for each of its attributes. Hence, although we require that each occurrence of an ordered pair $(r'_i(From_i), r'_i(To_i))$ in $\Phi'_\nu$, $\Phi'_\chi$, and $\Gamma'_\tau$ be replaced with the same attribute name (i.e., $N_{i,1}$), we could have specified any attribute of relation $R_i$.

We  will  need  the  following  two  lemmas  in  the  equivalence  proof  to  be  presented  shortly. 


Lemma 2  $\Phi_\nu$, $\Phi_\chi$, and $\Gamma_\tau$ are semantically equivalent to $\Phi'_\nu$, $\Phi'_\chi$, and $\Gamma'_\tau$ respectively. That is, the result of evaluating $\Phi'_\nu$, $\Phi'_\chi$, and $\Gamma'_\tau$ for tuples $r'_i$, $r'_i \in R'_i$, $1 \le i \le k$, is the same as the result of evaluating $\Phi_\nu$, $\Phi_\chi$, and $\Gamma_\tau$ for the intervals $I_i$, $I_i = \mathrm{EXTEND}(r'_i(From_i), \mathrm{SUCC}(r'_i(To_i)))$, substituted for the attribute name $N_{i,1}$.

PROOF. The semantic equivalence follows directly from the definitions of the functions used in $\Phi'_\nu$, $\Phi'_\chi$, and $\Gamma'_\tau$ [Snodgrass 1987]. ■


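Lemma 3  $t \in \mathrm{EXTEND}(\Phi'_\nu(\ldots), \mathrm{SUCC}(\Phi'_\chi(\ldots))) \rightarrow \mathit{Before}(\Phi'_\nu(\ldots), \Phi'_\chi(\ldots))$.

PROOF. It follows directly from the definition of EXTEND, given in Appendix B, that $t \in \mathrm{EXTEND}(\Phi'_\nu(\ldots), \mathrm{SUCC}(\Phi'_\chi(\ldots)))$ implies $\Phi'_\nu(\ldots) \le t < \Phi'_\chi(\ldots)$, which in turn implies $\mathit{Before}(\Phi'_\nu(\ldots), \Phi'_\chi(\ldots))$. ■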

Having defined the algebraic equivalents of TQuel relations and expressions in the new TQuel clauses, we can now define the algebraic equivalent of a TQuel retrieve statement. Every Quel retrieve statement (a target list and where clause) is equivalent to an algebraic expression that represents the cartesian product of the relations associated with tuple variables, followed by selection by the where-clause predicate, and then projection on the attributes in the target list. Similarly, every TQuel retrieve statement is equivalent to an algebraic expression that represents the cartesian product of the referenced relations, followed by selection by the where-clause predicate, historical derivation as specified by the when and valid clauses, and then projection on the attributes in the target list.
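
The conventional (snapshot) half of this correspondence, product then selection then projection, can be sketched as follows. The sketch covers only the Quel case; historical derivation is not modeled. The function `retrieve`, the relation contents, and the attribute names are illustrative assumptions.

```python
from itertools import product


def retrieve(relations, where, target):
    """Evaluate a Quel-style retrieve as product, selection, projection.

    relations: dict mapping tuple-variable name -> list of dict rows
    where:     predicate over a binding {variable: row}
    target:    list of (variable, attribute) pairs to project
    """
    names = list(relations)
    result = []
    for combo in product(*(relations[n] for n in names)):   # cartesian product
        binding = dict(zip(names, combo))
        if where(binding):                                   # selection
            row = tuple(binding[var][attr] for var, attr in target)  # projection
            if row not in result:                            # set semantics
                result.append(row)
    return result


emp = [{"name": "Ann", "dept": "CS"}, {"name": "Bob", "dept": "EE"}]
dept = [{"dept": "CS", "floor": 2}, {"dept": "EE", "floor": 3}]

answer = retrieve(
    {"e": emp, "d": dept},
    where=lambda b: b["e"]["dept"] == b["d"]["dept"],
    target=[("e", "name"), ("d", "floor")],
)
```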

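Theorem 2  Every TQuel retrieve statement of the form of (1) found on page 22 is equivalent to an expression in our historical algebra of the form

\[
R = \hat{\pi}_{N_{i_1,a_1}, \ldots, N_{i_n,a_n}}\bigl(\hat{\delta}_{\Gamma_\tau,\ \mathrm{EXTEND}(\Phi_\nu,\ \mathrm{SUCC}(\Phi_\chi))}\bigl(\hat{\sigma}_{\Psi_\psi}(T(R'_1) \mathbin{\hat{\times}} \cdots \mathbin{\hat{\times}} T(R'_k))\bigr)\bigr).
\tag{3}
\]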

PROOF. To prove that $R$ and $R'_{k+1}$ are temporally equivalent, we must show that $R = T(R'_{k+1})$. From set theory and the definition of T, it follows that $R$ and $T(R'_{k+1})$ are equal if, and only if, the following holds.


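\[
\begin{aligned}
& (\forall r,\ r \in R,\ \forall a,\ 1 \le a \le n,\ \forall t,\ t \in \mathit{valid}(r(N_a)), \\
& \quad \exists r'_{k+1},\ (r'_{k+1} \in R'_{k+1} \\
& \qquad \wedge\ \forall c,\ 1 \le c \le n,\ \mathit{value}(r(N_c)) = r'_{k+1}(N_{k+1,c}) \\
& \qquad \wedge\ t \in \mathrm{EXTEND}(r'_{k+1}(From_{k+1}),\ \mathrm{SUCC}(r'_{k+1}(To_{k+1}))) \\
& \quad ) \\
& ) \\
& \wedge\ (\forall r,\ r \in R,\ \forall r'_{k+1},\ (r'_{k+1} \in R'_{k+1} \wedge \forall a,\ 1 \le a \le n,\ r'_{k+1}(N_{k+1,a}) = \mathit{value}(r(N_a))), \\
& \quad \forall c,\ 1 \le c \le n, \\
& \quad \mathrm{EXTEND}(r'_{k+1}(From_{k+1}),\ \mathrm{SUCC}(r'_{k+1}(To_{k+1}))) \subseteq \mathit{valid}(r(N_c)) \\
& )
\end{aligned}
\tag{4}
\]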


To prove the validity of (4), we show that the tuple calculus for R reduces to (4). First, construct the tuple calculus statement for R from the definitions of the historical operators $\hat{\times}$, $\hat{\sigma}$, $\hat{\delta}$, and $\hat{\pi}$ using straightforward substitution, change of variable, and simplification (i.e., the definition of $T(R'_1) \mathbin{\hat{\times}} \cdots \mathbin{\hat{\times}} T(R'_k)$ obtained from the $\hat{\times}$ operator is substituted for references to the historical relation in the definition of $\hat{\sigma}$, etc.).


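\[
\hat{\pi}_{N_{i_1,a_1}, \ldots, N_{i_n,a_n}}\bigl(\hat{\delta}_{\Gamma_\tau,\ \mathrm{EXTEND}(\Phi_\nu,\ \mathrm{SUCC}(\Phi_\chi))}\bigl(\hat{\sigma}_{\Psi_\psi}(T(R'_1) \mathbin{\hat{\times}} \cdots \mathbin{\hat{\times}} T(R'_k))\bigr)\bigr) =
\]

\[
\begin{array}{rl}
1 & \{\, r^{n} \mid (\forall c,\ 1 \le c \le n,\ \forall t,\ t \in \mathit{valid}(r(N_c)), \\
2 & \quad (\exists r_1)\cdots(\exists r_k)(\exists I_1)\cdots(\exists I_k), \\
3 & \quad (r_1 \in T(R'_1) \wedge \cdots \wedge r_k \in T(R'_k) \\
4 & \qquad \wedge\ I_1 \in \mathrm{INTERVAL}(\mathit{valid}(r_1(N_{1,1}))) \wedge \cdots \\
5 & \qquad \wedge\ I_k \in \mathrm{INTERVAL}(\mathit{valid}(r_k(N_{k,1}))) \\
6 & \qquad \wedge\ \forall l,\ 1 \le l \le n,\ \mathit{value}(r(N_l)) = \mathit{value}(r_{i_l}(N_{i_l,a_l})) \\
7 & \qquad \wedge\ \Psi_\psi(\mathit{value}(r_1(N_{1,1})), \ldots, \mathit{value}(r_k(N_{k,m_k}))) \\
8 & \qquad \wedge\ \Gamma_\tau(I_1, \ldots, I_k) \\
9 & \qquad \wedge\ t \in \mathrm{EXTEND}(\Phi_\nu(I_1, \ldots, I_k),\ \mathrm{SUCC}(\Phi_\chi(I_1, \ldots, I_k))) \\
10 & \quad )) \\
11 & \wedge\ ((\forall r_1)\cdots(\forall r_k)(\forall I_1)\cdots(\forall I_k) \\
12 & \quad (r_1 \in T(R'_1) \wedge \cdots \wedge r_k \in T(R'_k) \\
13 & \qquad \wedge\ I_1 \in \mathrm{INTERVAL}(\mathit{valid}(r_1(N_{1,1}))) \wedge \cdots \\
14 & \qquad \wedge\ I_k \in \mathrm{INTERVAL}(\mathit{valid}(r_k(N_{k,1}))) \\
15 & \qquad \wedge\ \forall l,\ 1 \le l \le n,\ \mathit{value}(r_{i_l}(N_{i_l,a_l})) = \mathit{value}(r(N_l)) \\
16 & \qquad \wedge\ \Psi_\psi(\mathit{value}(r_1(N_{1,1})), \ldots, \mathit{value}(r_k(N_{k,m_k}))) \\
17 & \qquad \wedge\ \Gamma_\tau(I_1, \ldots, I_k) \\
18 & \quad ), \\
19 & \quad \forall c,\ 1 \le c \le n, \\
20 & \quad \mathrm{EXTEND}(\Phi_\nu(I_1, \ldots, I_k),\ \mathrm{SUCC}(\Phi_\chi(I_1, \ldots, I_k))) \subseteq \mathit{valid}(r(N_c)) \\
21 & ) \\
22 & \wedge\ (\exists c,\ 1 \le c \le n \wedge \mathit{valid}(r(N_c)) \ne \emptyset)\,\}
\end{array}
\tag{5}
\]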
The three main clauses in the above calculus statement correspond to the three clauses in the definition of $\hat{\pi}$, which appears on page 8. The $\hat{\times}$ operator contributes the phrase $r_1 \in T(R'_1) \wedge \cdots \wedge r_k \in T(R'_k)$ that appears in lines 3 and 12 of the calculus statement. The $\hat{\sigma}$ operator contributes the predicate found on lines 7 and 16 and the $\hat{\delta}$ operator contributes the predicates found on lines 4-5, 8-9, 13-14, and 17-20.

We  now  use  the  definitions  and  lemmas  presented  earlier,  along  with  set  theory,  to  reduce  the 
tuple  calculus  for  R  to  (4).  The  first  clause  in  (5),  along  with  Lemma  1,  implies  that 


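\[
\begin{aligned}
& \forall r,\ r \in R,\ \forall c,\ 1 \le c \le n,\ \forall t,\ t \in \mathit{valid}(r(N_c)), \\
& \quad (\exists r'_1)\cdots(\exists r'_k), \\
& \quad (r'_1 \in R'_1 \wedge \cdots \wedge r'_k \in R'_k \\
& \qquad \wedge\ \forall l,\ 1 \le l \le n,\ \mathit{value}(r(N_l)) = r'_{i_l}(N_{i_l,a_l}) \\
& \qquad \wedge\ \Psi_\psi(r'_1(N_{1,1}), \ldots, r'_k(N_{k,m_k})) \\
& \qquad \wedge\ \Gamma_\tau(\mathrm{EXTEND}(r'_1(From_1), \mathrm{SUCC}(r'_1(To_1))), \ldots, \\
& \qquad\qquad \mathrm{EXTEND}(r'_k(From_k), \mathrm{SUCC}(r'_k(To_k)))) \\
& \qquad \wedge\ t \in \mathrm{EXTEND}(\Phi_\nu(\mathrm{EXTEND}(r'_1(From_1), \mathrm{SUCC}(r'_1(To_1))), \ldots, \\
& \qquad\qquad\qquad \mathrm{EXTEND}(r'_k(From_k), \mathrm{SUCC}(r'_k(To_k)))), \\
& \qquad\qquad \mathrm{SUCC}(\Phi_\chi(\mathrm{EXTEND}(r'_1(From_1), \mathrm{SUCC}(r'_1(To_1))), \ldots, \\
& \qquad\qquad\qquad \mathrm{EXTEND}(r'_k(From_k), \mathrm{SUCC}(r'_k(To_k)))))) \\
& \quad )
\end{aligned}
\tag{6}
\]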

Applying  Lemma  2  to  (6)  results  in 


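\[
\begin{aligned}
& \forall r,\ r \in R,\ \forall c,\ 1 \le c \le n,\ \forall t,\ t \in \mathit{valid}(r(N_c)), \\
& \quad (\exists r'_1)\cdots(\exists r'_k), \\
& \quad (r'_1 \in R'_1 \wedge \cdots \wedge r'_k \in R'_k \\
& \qquad \wedge\ \forall l,\ 1 \le l \le n,\ \mathit{value}(r(N_l)) = r'_{i_l}(N_{i_l,a_l}) \\
& \qquad \wedge\ \Psi'_\psi(r'_1(N_{1,1}), \ldots, r'_k(N_{k,m_k})) \\
& \qquad \wedge\ \Gamma'_\tau((r'_1(From_1), r'_1(To_1)), \ldots, (r'_k(From_k), r'_k(To_k))) \\
& \qquad \wedge\ t \in \mathrm{EXTEND}(\Phi'_\nu((r'_1(From_1), r'_1(To_1)), \ldots, (r'_k(From_k), r'_k(To_k))), \\
& \qquad\qquad \mathrm{SUCC}(\Phi'_\chi((r'_1(From_1), r'_1(To_1)), \ldots, (r'_k(From_k), r'_k(To_k))))) \\
& \quad )
\end{aligned}
\tag{7}
\]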
The third clause of (5) on page 27 implies that $\forall r,\ r \in R,\ (\exists c)(\exists t),\ 1 \le c \le n,\ t \in \mathit{valid}(r(N_c))$. Hence, applying Lemma 3 and the tuple calculus statement for $R'_{k+1}$ in (2) on page 23 to (7) results in


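\[
\begin{aligned}
& \forall r,\ r \in R,\ \forall c,\ 1 \le c \le n,\ \forall t,\ t \in \mathit{valid}(r(N_c)), \\
& \quad \exists r'_{k+1},\ (r'_{k+1} \in R'_{k+1} \\
& \qquad \wedge\ \forall l,\ 1 \le l \le n,\ \mathit{value}(r(N_l)) = r'_{k+1}(N_{k+1,l}) \\
& \qquad \wedge\ t \in \mathrm{EXTEND}(r'_{k+1}(From_{k+1}),\ \mathrm{SUCC}(r'_{k+1}(To_{k+1}))) \\
& \quad )
\end{aligned}
\]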

Thus, the first clause of (4) is shown to hold. A similar argument can be made, starting with the second main clause of (5), to show that the second clause of (4) holds. Since (4) holds, $R$ and $R'_{k+1}$ are equivalent and the historical algebra expression is equivalent to the indicated TQuel retrieve statement. ■


3.3  TQuel  Aggregates 

TQuel aggregates [Snodgrass, et al. 1987] are a superset of the Quel aggregates. Hence, each of Quel’s six non-unique aggregates (i.e., count, any, sum, avg, min, and max) and three unique aggregates (i.e., countU, sumU, and avgU) has a TQuel counterpart. The TQuel version of each of these aggregates performs the same fundamental operation as its Quel counterpart, with one significant difference. Because an historical relation represents the changing value of its attributes and aggregates are computed from the entire relation, aggregates in TQuel return a distribution of values over time. Hence, while in Quel an aggregate with no by-list returns a single value, in TQuel the same aggregate returns a sequence of values, each assigned its valid times. When there is a by-list, an aggregate in TQuel returns a sequence of values for each value of the attributes in the by-list.

Several  aggregates  are  only  found  in  TQuel:  standard  deviation  (stdev  and  stdevU),  average 
time  increment  (avgti),  the  variability  of  time  spacing  (varts),  oldest  value  (first),  newest  value 
(last),  From-To  interval  with  the  earliest  From  time  (earliest),  and  From-To  interval  with  the 
latest  From  time  (latest). 

Each  TQuel  aggregate  has  a  counterpart  in  our  historical  algebra.  The  algebraic  equivalents  of 
TQuel  aggregates  are  defined  in  terms  of  the  historical  aggregate  functions  A  and  AU ,  which  were 
defined  in  Section  2.3.  Before  defining  the  algebraic  equivalents  of  TQuel  aggregates  in  the  context 
of  a  TQuel  retrieve  statement  however,  we  consider  the  families  of  scalar  aggregates  that  appear 
as  parameters  to  A  and  AU  in  the  algebraic  equivalents  of  TQuel  aggregates.  Each  aggregate  in 
one of these families of scalar aggregates returns, for a partition of historical relation R at time t,
the same value returned by its analogous TQuel scalar aggregate for a partition of relation R' at
time t, where R = T(R').


We define here the families of scalar aggregates that appear as parameters to A and AU in the algebraic equivalents of the TQuel aggregates count, countU, first, and earliest. We present these definitions to illustrate our approach for defining the families of scalar aggregates that appear in the algebraic equivalents of TQuel aggregates. The approach can be used to define the families of scalar aggregates found in the algebraic equivalents of the other TQuel aggregates as well. The aggregates count and countU illustrate how conventional aggregate operators, now applied to historical relations, can be handled. The aggregate first is an example of an aggregate that evaluates to a non-temporal domain such as character but uses an attribute’s valid time in a way different from the conventional aggregate operators. Finally, earliest illustrates an aggregate that evaluates to an interval.

For the definitions that follow, let R be an historical relation of m-tuples over the relation scheme $\mathcal{N} = \{N_1, \ldots, N_m\}$ and Q be an historical relation over an arbitrary subscheme of $\mathcal{N}$.

Although the scalar aggregate COUNT, introduced on page 14, is sufficient to define the algebraic equivalent of the TQuel aggregates count and countU for an aggregation window of length zero (i.e., an instantaneous aggregate), it is not sufficient to define the algebraic equivalent of count and countU for an aggregation window of any other length. Hence, we define another family of scalar aggregates $\mathrm{COUNTINT}_{N_a}$, $1 \le a \le m$, that accommodates aggregation windows of arbitrary length by counting intervals rather than values.


\[
\mathrm{COUNTINT}_{N_a}(q, t, R) = \sum_{r \in R} \bigl|\mathrm{INTERVAL}(\mathit{valid}(r(N_a)))\bigr|
\]

where $N_a$ is an attribute of both Q and R, $q \in Q$, and $t \in T$. Recall that INTERVAL, formally defined in Appendix B, returns the set of intervals contained in its argument. Hence, COUNTINT simply sums the number of intervals in the time-stamp of attribute $N_a$ of each tuple in R.
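
A minimal sketch of COUNTINT follows. It assumes that a time-stamp is a set of integer chronons, that INTERVAL yields the maximal runs of consecutive chronons (the formal definition is in Appendix B and is not reproduced here), and that a tuple is a dict mapping attribute names to (value, time-stamp) pairs. The names `intervals`, `countint`, and the sample relation are hypothetical.

```python
def intervals(stamp):
    """ASSUMED reading of INTERVAL: maximal runs of consecutive chronons,
    returned as (first, last) pairs in chronological order."""
    runs = []
    for t in sorted(stamp):
        if runs and t == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], t)   # extend the current run
        else:
            runs.append((t, t))           # start a new run
    return runs


def countint(q, t, relation, attr):
    """Sum, over all tuples, the number of intervals in the time-stamp of
    attribute `attr`. As in the definition, q and t do not affect the result."""
    return sum(len(intervals(row[attr][1])) for row in relation)


R = [
    {"State": ("a", {1, 2, 5, 6})},   # two intervals: 1-2 and 5-6
    {"State": ("b", {3})},            # one interval: 3-3
]
total = countint(None, 0, R, "State")
```

Counting values instead of intervals would give 2 here; counting intervals gives 3, which is the distinction the text draws between COUNT and COUNTINT.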

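Next, we consider the TQuel aggregate first. This aggregate requires a family of scalar aggregate functions $\mathrm{FIRSTVALUE}_{N_a}$, $1 \le a \le m$, where $\mathrm{FIRSTVALUE}_{N_a}$ produces the oldest value of attribute $N_a$. That is,

\[
\begin{aligned}
\mathrm{FIRSTVALUE}_{N_a}(q, t, R) \in \{\, u \mid{} & (R \ne \emptyset \rightarrow \exists r,\ (r \in R \\
& \qquad \wedge\ \forall r',\ r' \in R,\ \mathrm{FIRST}(r(N_a)) \le \mathrm{FIRST}(r'(N_a)) \\
& \qquad \wedge\ u = \mathit{value}(r(N_a)) \\
& \quad )) \\
& \wedge\ (R = \emptyset \rightarrow u = \mathrm{NULLVALUE}(N_a)) \\
& \}
\end{aligned}
\]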

where NULLVALUE is an auxiliary function that returns a special null value for the domain associated with its argument. Note that the set {u | . . .} need not be a singleton set. If there are two or more elements in the set, FIRSTVALUE returns only one element, that element being selected arbitrarily. This procedure is the same as that used by the TQuel aggregate first to select the oldest value of an attribute when there are multiple values that satisfy the selection criteria. If R is empty, FIRSTVALUE returns a special null value for the domain associated with attribute $N_a$.
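
The behavior of FIRSTVALUE can be sketched as below, under the assumptions that a tuple is a dict mapping attribute names to (value, time-stamp) pairs, that time-stamps are non-empty sets of integer chronons, and that Python's `None` stands in for the special null value. The names are illustrative only.

```python
NULLVALUE = None   # stand-in for the domain's special null value (assumption)


def firstvalue(q, t, relation, attr):
    """Return the value of `attr` in a tuple whose time-stamp starts earliest.

    When several tuples tie, min() keeps the first one it meets, which is
    one way of making the arbitrary choice the definition permits.
    """
    if not relation:
        return NULLVALUE
    oldest = min(relation, key=lambda row: min(row[attr][1]))
    return oldest[attr][0]


R = [
    {"Rank": ("assistant", {3, 4})},
    {"Rank": ("associate", {1, 2})},
    {"Rank": ("full", {5})},
]
oldest_rank = firstvalue(None, 0, R, "Rank")
```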

Finally, we define the algebraic equivalent of the TQuel aggregate earliest. Unlike other TQuel aggregates, which produce a distribution of scalar values over time, earliest produces a distribution of intervals over time. Defining an algebraic equivalent for this aggregate is slightly more complicated owing to this distinction. We first introduce a family of auxiliary functions $\mathrm{ORDERINT}_{N_a}$, $1 \le a \le m$, which orders chronologically all distinguishable intervals in the time-stamp of attribute $N_a$ for tuples of historical relation R.

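\[
\begin{aligned}
S = \mathrm{ORDERINT}_{N_a}(R) \leftrightarrow{} & (\forall r)(\forall I),\ (r \in R \wedge I \in \mathrm{INTERVAL}(\mathit{valid}(r(N_a)))), \\
& \quad \exists u,\ 1 \le u \le |S| \wedge S_u = I \\
& \wedge\ \forall u,\ 1 \le u \le |S|, \\
& \quad (\exists r)(\exists I),\ (r \in R \wedge I \in \mathrm{INTERVAL}(\mathit{valid}(r(N_a))) \wedge S_u = I) \\
& \wedge\ \forall u,\ 2 \le u \le |S|, \\
& \quad (\mathrm{FIRST}(S_{u-1}) < \mathrm{FIRST}(S_u) \\
& \quad \vee\ (\mathrm{FIRST}(S_{u-1}) = \mathrm{FIRST}(S_u) \wedge \mathrm{LAST}(S_{u-1}) < \mathrm{LAST}(S_u)))
\end{aligned}
\]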

where S is a sequence of length $|S|$ and $S_u$ is the u-th element of S. Evaluating $\mathrm{ORDERINT}_{N_a}(R)$ results in a sequence of the intervals appearing in the time-stamp of attribute $N_a$ of tuples in R. The intervals are ordered from earliest starting time to latest starting time. When two or more intervals have the same starting time, they are ordered from the earliest stopping time to the latest stopping time. The first clause states that each interval in the time-stamp of attribute $N_a$ of a tuple in R appears in S, the second clause states that no additional intervals are present, and the third clause provides the ordering conditions.
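
The three clauses of ORDERINT (every interval appears, nothing else appears, and the sequence is strictly ordered by start then stop) can be sketched as follows. The sketch assumes time-stamps are sets of integer chronons, INTERVAL yields maximal runs of consecutive chronons as (first, last) pairs, and tuples are dicts of (value, time-stamp) pairs. The sample relation is invented for illustration and is not a relation from the paper.

```python
def intervals(stamp):
    """ASSUMED reading of INTERVAL: maximal runs of consecutive chronons,
    returned as (first, last) pairs."""
    runs = []
    for t in sorted(stamp):
        if runs and t == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], t)
        else:
            runs.append((t, t))
    return runs


def orderint(relation, attr):
    """Chronologically ordered sequence of the distinct intervals found in
    the time-stamps of `attr` across all tuples of `relation`."""
    found = set()                       # the strict ordering rules out duplicates
    for row in relation:
        _value, stamp = row[attr]
        found.update(intervals(stamp))
    # (first, last) pairs sort by start time, then by stop time.
    return sorted(found)


R = [
    {"State": ("x", {4, 5, 6})},
    {"State": ("y", {1, 2, 3})},
    {"State": ("z", {1, 2, 7, 8})},
    {"State": ("w", {5, 6})},
]
S = orderint(R, "State")
```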

Now, we can define a family of scalar aggregate functions $\mathrm{POSITION}_{N_a}$, $1 \le a \le m$, where $\mathrm{POSITION}_{N_a}$ first identifies, for a tuple q and time t, the interval in the valid component of attribute $N_a$ in q that overlaps t and then calculates the position of that interval in $\mathrm{ORDERINT}_{N_a}(R)$, for an historical relation R. If no interval in the valid component of attribute $N_a$ overlaps t or the interval is not in $\mathrm{ORDERINT}_{N_a}(R)$, $\mathrm{POSITION}_{N_a}$ returns zero.

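\[
\begin{aligned}
\mathrm{POSITION}_{N_a}(q, t, R) = u \leftrightarrow{} & ((\exists I)(\exists S_v),\ (I \in \mathrm{INTERVAL}(\mathit{valid}(q(N_a))) \\
& \qquad \wedge\ 1 \le v \le |\mathrm{ORDERINT}_{N_a}(R)| \\
& \qquad \wedge\ S_v \in \mathrm{ORDERINT}_{N_a}(R) \\
& \qquad \wedge\ t \in I \wedge I = S_v) \\
& ) \rightarrow u = v \\
& \wedge\ ((\forall I)(\forall S_v),\ (I \in \mathrm{INTERVAL}(\mathit{valid}(q(N_a))) \\
& \qquad \wedge\ 1 \le v \le |\mathrm{ORDERINT}_{N_a}(R)| \\
& \qquad \wedge\ S_v \in \mathrm{ORDERINT}_{N_a}(R) \\
& \quad ),\ t \notin I \vee I \ne S_v \\
& ) \rightarrow u = 0
\end{aligned}
\]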

Note that POSITION, unlike COUNTINT and FIRSTVALUE, requires parameters q and t, as well as R.
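
A sketch of POSITION under the same modeling assumptions used elsewhere in these illustrations (integer chronons, INTERVAL as maximal runs returned as (first, last) pairs, tuples as dicts of (value, time-stamp) pairs) is given below. All function names and the sample data are hypothetical.

```python
def intervals(stamp):
    """ASSUMED reading of INTERVAL: maximal runs as (first, last) pairs."""
    runs = []
    for t in sorted(stamp):
        if runs and t == runs[-1][1] + 1:
            runs[-1] = (runs[-1][0], t)
        else:
            runs.append((t, t))
    return runs


def orderint(relation, attr):
    """Distinct intervals of `attr` over `relation`, ordered by start, then stop."""
    found = set()
    for row in relation:
        found.update(intervals(row[attr][1]))
    return sorted(found)


def position(q, t, relation, attr):
    """1-based position in ORDERINT of the interval of q's `attr` time-stamp
    that contains t, or 0 if there is no such interval."""
    order = orderint(relation, attr)
    for first, last in intervals(q[attr][1]):
        if first <= t <= last and (first, last) in order:
            return order.index((first, last)) + 1
    return 0


R = [
    {"State": ("x", {4, 5, 6})},
    {"State": ("y", {1, 2, 3})},
    {"State": ("z", {1, 2, 7, 8})},
    {"State": ("w", {5, 6})},
]
```

For example, `position(R[1], 2, R, "State")` looks up the interval (1, 3), which is second in the ordering, while `position(R[1], 5, R, "State")` returns 0 because no interval of that tuple contains chronon 5. Both q and t affect the result, in contrast to the earlier aggregates.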

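Now assume that we are given a family of scalar aggregate functions $\mathrm{SMALLEST}_{N_a}$, $1 \le a \le m$, where $\mathrm{SMALLEST}_{N_a}$ produces the smallest value of numeric attribute $N_a$. That is,

\[
\begin{aligned}
\mathrm{SMALLEST}_{N_a}(q, t, R) = u \leftrightarrow{} & (R \ne \emptyset \rightarrow \exists r,\ (r \in R \\
& \qquad \wedge\ \forall r',\ r' \in R,\ \mathit{value}(r(N_a)) \le \mathit{value}(r'(N_a)) \\
& \qquad \wedge\ u = \mathit{value}(r(N_a)) \\
& \quad )) \\
& \wedge\ (R = \emptyset \rightarrow u = 0)
\end{aligned}
\]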

The families of scalar aggregates POSITION and SMALLEST are both needed to define the algebraic equivalent of the TQuel aggregate earliest for attribute $N_a$ of relation $R'$. First, POSITION is used to assign each interval in the time-stamp of attribute $N_a$ of a tuple in $T(R')$ to an integer representing the interval’s relative position in the chronological ordering of intervals. Then, SMALLEST is used to determine, from this assignment of intervals to integers, the times, if any, when each interval was the earliest interval. If we assume an aggregation window function $w(t) = 0$ and an empty set of by-clause attributes, the algebraic equivalent of the TQuel aggregate earliest for attribute $N_a$ of relation $R'$ is

\[
R_{earliest} = \hat{\sigma}_{N_{earliest,1} = N_{earliest,2}}\bigl(\hat{A}_{\mathrm{SMALLEST},\ 0,\ N_{position},\ \emptyset}(R_{position}, R_{position}) \mathbin{\hat{\times}} R_{position}\bigr)
\tag{8}
\]

over the scheme $\mathcal{N}_{earliest} = \{N_{earliest,1}, N_{earliest,2}\}$ where

\[
R_{position} = \hat{\sigma}_{N_{position} \ne 0}\bigl(\hat{A}_{\mathrm{POSITION},\ \infty,\ N_a,\ \emptyset}(R, R)\bigr)
\tag{9}
\]

over the scheme $\mathcal{N}_{position} = \{N_{position}\}$.

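EXAMPLE. If we assume an aggregation window function $w(t) = 0$ and an empty set of by-clause attributes, then earliest for attribute State of relation $S_6$ is

\[
\begin{aligned}
R_{earliest} ={} & \hat{\sigma}_{N_{earliest,1} = N_{earliest,2}}\bigl(\hat{A}_{\mathrm{SMALLEST},\ 0,\ N_{position},\ \emptyset}(R_{position}, R_{position}) \mathbin{\hat{\times}} R_{position}\bigr) = \\
& \{\ ((1, \{1,2\}),\ (1, \{1,2\})), \\
& \ \ ((2, \{3\}),\ (2, \{1,2,3\})), \\
& \ \ ((3, \{4,5,6\}),\ (3, \{4,5,6\})), \\
& \ \ ((5, \{7,8\}),\ (5, \{7,8\}))\ \}
\end{aligned}
\]

where $R_{position}$ is

\[
\begin{aligned}
R_{position} ={} & \hat{\sigma}_{N_{position} \ne 0}\bigl(\hat{A}_{\mathrm{POSITION},\ \infty,\ State,\ \emptyset}(S_6, S_6)\bigr) = \\
& \{\ ((1, \{1,2\})), \\
& \ \ ((2, \{1,2,3\})), \\
& \ \ ((3, \{4,5,6\})), \\
& \ \ ((4, \{5,6\})), \\
& \ \ ((5, \{7,8\}))\ \} \qquad \Box
\end{aligned}
\]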

As illustrated in this example, the algebraic equivalent of earliest is a two-attribute historical relation. The valid component of the first attribute is the time when the valid component of the second attribute was the earliest interval. Also note that the value component of both attributes is the position of the valid component of the second attribute in $\mathrm{ORDERINT}_{N_a}(R)$.
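
The numbers in the EXAMPLE can be reproduced with a short sketch. It takes the five (position, time-stamp) pairs listed for the position relation as given, computes at every chronon the smallest position valid at that chronon (an instantaneous SMALLEST with no by-list), and then pairs each result with the position tuple carrying the same value. Reading the outer step as an equality match on the value components is an assumption drawn from the result shown in the example; the function names are illustrative.

```python
def instantaneous_smallest(r_position):
    """For each chronon, find the smallest position valid at that chronon,
    then group chronons by that position: [(position, time-stamp), ...]."""
    best = {}
    for pos, stamp in r_position:
        for t in stamp:
            best[t] = min(pos, best.get(t, pos))
    grouped = {}
    for t, pos in best.items():
        grouped.setdefault(pos, set()).add(t)
    return sorted(grouped.items())      # positions are unique, so this sorts by position


def earliest(r_position):
    """Pair each aggregate tuple with the position tuple of equal value
    (ASSUMED equality match on value components)."""
    smallest = instantaneous_smallest(r_position)
    return [
        ((p1, s1), (p2, s2))
        for p1, s1 in smallest
        for p2, s2 in r_position
        if p1 == p2
    ]


r_position = [
    (1, {1, 2}),
    (2, {1, 2, 3}),
    (3, {4, 5, 6}),
    (4, {5, 6}),
    (5, {7, 8}),
]
r_earliest = earliest(r_position)
```

Position 4 never appears in the result because at chronons 5 and 6 the interval at position 3 is also valid and starts earlier.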


3.3.1  TQuel  Aggregates  in  the  Target  List 

In  Section  3.2  we  showed  the  algebraic  equivalent  of  the  TQuel  retrieve  statement  without  aggre¬ 
gates.  We  now  show  the  algebraic  equivalent  of  a  TQuel  retrieve  statement  with  aggregates  in  its 
target  list.  We  consider  changes  to  the  algebraic  expression  to  support  one  non-unique  aggregate 
in the target list only; similar changes would be needed for each additional aggregate in the target
list.

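Once again assume that we are given the k snapshot relations $R'_1, \ldots, R'_k$ whose schemes are, respectively,

\[
\begin{aligned}
\mathcal{N}'_1 &= \{N_{1,1}, \ldots, N_{1,m_1}, From_1, To_1\} \\
&\;\;\vdots \\
\mathcal{N}'_k &= \{N_{k,1}, \ldots, N_{k,m_k}, From_k, To_k\}
\end{aligned}
\]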


where, for notational convenience, we assume that $N_{1,1}, \ldots, N_{k,m_k}$ are unique. Also, let

$i_1, i_2, \ldots, i_n$ and $j_1, j_2, \ldots, j_p$ be integers, not necessarily distinct, in the range 1 to k, indicating the tuple variables (possibly repeated) appearing in the target list and aggregate, respectively;

$a_l$, $1 \le l \le n$, be an integer in the range 1 to $m_{i_l}$, indicating the attribute names appearing in the target list where $(\forall u)(\forall v),\ (1 \le u \le n \wedge 1 \le v \le n \wedge u \ne v \wedge i_u = i_v),\ a_u \ne a_v$;

$c_h$, $1 \le h \le p$, be an integer in the range 1 to $m_{j_h}$, indicating the attribute names appearing in the aggregate where $(\forall u)(\forall v),\ (1 \le u \le p \wedge 1 \le v \le p \wedge u \ne v \wedge j_u = j_v),\ c_u \ne c_v$; and

$\hat{\jmath}_1, \hat{\jmath}_2, \ldots, \hat{\jmath}_x$ be the distinct integers in $j_1, j_2, \ldots, j_p$ where $\hat{\jmath}_1 = j_1$, indicating the x (non-repeated) tuple variables appearing in the aggregate.

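Then, the TQuel retrieve statement with the aggregate $f'_1$ in the target list has the following syntax

    range of r'_1 is R'_1
        ...
    range of r'_k is R'_k
    retrieve into R'_{k+1} (N_{k+1,1} = r'_{i_1}.N_{i_1,a_1}, ..., N_{k+1,n} = r'_{i_n}.N_{i_n,a_n},
                            N_{k+1,n+1} = f'_1(r'_{j_1}.N_{j_1,c_1} by r'_{j_2}.N_{j_2,c_2}, ..., r'_{j_p}.N_{j_p,c_p}
                                               for ω_1
                                               where ψ_1
                                               when τ_1))                                      (10)
    valid from ν to χ
    where ψ
    when τ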


This statement computes a new relation $R'_{k+1}$ over the relational scheme

\[
\mathcal{N}'_{k+1} = \{N_{k+1,1}, \ldots, N_{k+1,n}, N_{k+1,n+1}, From_{k+1}, To_{k+1}\}
\]

The for clause specifies an aggregation window function for the aggregate $f'_1$. $\omega_1$ contains one or more keywords that determine, along with the time granularity of $R'_1, \ldots, R'_k$, the length of the aggregation window at each time t. The keywords each instant represent the aggregation window function $w(t) = 0$ (i.e., an instantaneous aggregate) and the keyword ever represents the aggregation window function $w(t) = \infty$ (i.e., a cumulative aggregate). The length of the aggregation window specified by other keywords (e.g., each day, each week, each year) is a function of the underlying time granularity of the database. For example, if the time granularity is a day, then $\omega_1$ = each week translates to the aggregation window function $w(t) = 6$. Also, the aggregation window function need not be a constant function. For example, if the time granularity is a day, then $\omega_1$ = each month translates to the aggregation window function w, where $w(t) = 31$ if t corresponds to January 31 and $w(t) = 28$ if t corresponds to February 28. We let $\Omega_{\omega_1}$ be the aggregation window function denoted by $\omega_1$ and the time granularity of $R'_1, \ldots, R'_k$.
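
The translation from a for-clause keyword to an aggregation window function can be sketched for a database whose time granularity is a day. The sketch reproduces only the values quoted in the text (0 for each instant, infinity for ever, 6 for each week, and the month-dependent values 31 and 28); the function name `window`, the use of `datetime.date` for chronons, and the choice of a non-leap year are assumptions made for illustration.

```python
import calendar
import math
from datetime import date


def window(keyword, t):
    """Aggregation window length w(t) for a day-granularity database.

    ASSUMPTION: chronons are calendar days represented as datetime.date.
    """
    if keyword == "each instant":
        return 0                      # instantaneous aggregate
    if keyword == "ever":
        return math.inf               # cumulative aggregate
    if keyword == "each week":
        return 6                      # value quoted in the text
    if keyword == "each month":
        # Not a constant function: depends on the month containing t.
        # Follows the values quoted in the text (31 in January, 28 in February).
        return calendar.monthrange(t.year, t.month)[1]
    raise ValueError(f"unsupported keyword: {keyword!r}")


jan = window("each month", date(1987, 1, 31))
feb = window("each month", date(1987, 2, 28))
```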

Every  TQuel  retrieve  statement  of  the  form  of  (10)  is  equivalent  to  an  expression  in  our  historical 
algebra  of  the  form 


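\[
\begin{aligned}
R = \hat{\pi}_{N_{i_1,a_1}, \ldots, N_{i_n,a_n}, N_{agg_1,p}}\bigl( & \hat{\delta}_{\Gamma_\tau,\ \mathrm{EXTEND}(\Phi_\nu,\ \mathrm{SUCC}(\Phi_\chi))\ \cap\ N_{\hat{\jmath}_1,1}\ \cap\ \cdots\ \cap\ N_{\hat{\jmath}_x,1}\ \cap\ N_{agg_1,p}}\bigl( \\
& \hat{\sigma}_{\Psi_\psi\ \wedge\ \cdots}(T(R'_1) \mathbin{\hat{\times}} \cdots \mathbin{\hat{\times}} T(R'_k) \mathbin{\hat{\times}} R_{agg_1})\bigr)\bigr)
\end{aligned}
\tag{11}
\]

where

\[
\begin{aligned}
R_{agg_1} = \hat{A}_{f_1,\ \Omega_{\omega_1},\ N_{j_1,c_1},\ \ldots,\ N_{j_p,c_p}}\bigl( & T(R'_{\hat{\jmath}_1}) \mathbin{\hat{\times}} \cdots \mathbin{\hat{\times}} T(R'_{\hat{\jmath}_x}), \\
& \hat{\delta}_{\Gamma_{\tau_1},\ \cdots}(\hat{\sigma}_{\Psi_{\psi_1}}(T(R'_{\hat{\jmath}_1}) \mathbin{\hat{\times}} \cdots \mathbin{\hat{\times}} T(R'_{\hat{\jmath}_x})))\bigr)
\end{aligned}
\tag{12}
\]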

over the scheme $\mathcal{N}_{agg_1} = \{N_{agg_1,1}, \ldots, N_{agg_1,p}\}$, where $\forall u$, $1 \le u \le p-1$, $N_{agg_1,u} = N_{j_{u+1},c_{u+1}}$ and $N_{agg_1,p}$ is the attribute name associated with the aggregate value. Here we assume that $f_1$ is the family of scalar aggregates (e.g., COUNTINT) corresponding to the family of TQuel aggregates $f'_1$ (e.g., count). Expression (12) applies the where and when predicates to the cartesian product of the relations associated with tuple variables appearing in the aggregate, and applies the aggregate operator to the result. Expression (11) differs only slightly from the expression (3) on page 26 for a retrieve statement without aggregates. The expanded selection operator provides the necessary linkage between the attributes in the aggregate’s by-list and corresponding attributes in the base relations. The expanded derivation operator imposes the TQuel restriction that the valid time of tuples in the derived relation be the intersection of the valid time specified in the valid clause, the valid times of the tuples in the base relations participating in the aggregation, and the valid time of the aggregate itself. Of course, if $f'_1$ is a unique aggregate, then AU should be used instead of A in (12).

Two changes to (11) are required to handle special cases. First, if a tuple variable $r'_{\hat{\jmath}_u}$, $1 \le u \le x$, does not appear outside the aggregate $f'_1$ in (10), then $N_{\hat{\jmath}_u,1}$ does not appear in the second subscript of the $\hat{\delta}$ operator. Also, if $r'_{j_1}$ appears neither outside the aggregate $f'_1$ in (10) nor in its by clause, then $R_{agg_1}$ is replaced by

\[
R_{agg_1} \mathbin{\hat{\cup}} \{\, ((\mathrm{NULLVALUE}(N_{j_1,c_1}),\ \{t \mid \forall r,\ r \in R_{agg_1},\ t \notin \mathit{valid}(r(N_{agg_1,p}))\}))\, \}
\]

The first change removes the restriction that the valid time of a tuple in the derived relation must intersect the valid time of at least one tuple in the base relation associated with tuple variable $r'_{\hat{\jmath}_u}$. The second change ensures that a value (possibly a distinguished null value) for the aggregate is specified at each time t, $t \in T$.
3.3.2  TQuel Aggregates in the Inner Where Clause


Aggregates may also appear in the where, when, and valid clauses of a TQuel retrieve statement. We now show the algebraic equivalents of TQuel retrieve statements with aggregates in these clauses, first presenting the algebraic equivalent of a TQuel retrieve statement with an aggregate in an inner where clause. Assume that a TQuel aggregate $f'_2$ appears in $\psi_1$ in (10) and let


$g_1, g_2, \ldots, g_y$ be integers, not necessarily distinct, in the range 1 to k, indicating the (possibly repeated) tuple variables appearing in the nested aggregate where $\forall u$, $1 \le u \le y$, $\exists v$, $1 \le v \le p$, $g_u = j_v$;

$d_l$, $1 \le l \le y$, be an integer in the range 1 to $m_{g_l}$, indicating the attribute names appearing in the nested aggregate where $(\forall u)(\forall v),\ (1 \le u \le y \wedge 1 \le v \le y \wedge u \ne v \wedge g_u = g_v),\ d_u \ne d_v$; and

$\hat{g}_1, \hat{g}_2, \ldots, \hat{g}_z$ be the distinct integers in $g_1, g_2, \ldots, g_y$ where $\hat{g}_1 = g_1$, indicating the z (non-repeated) tuple variables in the aggregate.


Then, $f'_2$ in $\psi_1$ has the following syntax

    f'_2(r'_{g_1}.N_{g_1,d_1} by r'_{g_2}.N_{g_2,d_2}, ..., r'_{g_y}.N_{g_y,d_y}
         for ω_2
         where ψ_2
         when τ_2)


As this TQuel retrieve statement is complicated, containing a nested aggregate with a full complement of by, for, where, and when clauses, we should expect a somewhat complicated algebraic equivalent.

When modified to account for $f'_2$ in $\psi_1$, the algebraic equivalent of $f'_1$, given in (12), becomes


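    [Expression (13) is illegible in the source. Its legible fragment shows the second argument of the aggregate operator built over $T(R'_{\hat{\jmath}_1}) \mathbin{\hat{\times}} \{((1, T))\} \mathbin{\hat{\times}} \cdots \mathbin{\hat{\times}} T(R'_{\hat{\jmath}_x}) \mathbin{\hat{\times}} R_{agg_2}$.]          (13)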
where the attribute name $N_{agg_2,y}$ here refers to the aggregate produced in A by $f_2$, the reference to the aggregate in $\Psi_{\psi_1}$ is replaced by a reference to $N_{agg_2,y}$, and $R_{agg_2}$ is the result of applying the aggregate operator for $f_2$ with window $\Omega_{\omega_2}$ to the relations $T(R'_{\hat{g}_1}), \ldots, T(R'_{\hat{g}_z})$

    [The expression defining $R_{agg_2}$ is illegible in the source.]

over the scheme $\mathcal{N}_{agg_2} = \{N_{agg_2,1}, \ldots, N_{agg_2,y}\}$, and $f_2$ is the family of scalar aggregates corresponding to the family of TQuel aggregates $f'_2$.

$\{((1, T))\}$ is a constant relation containing a single tuple whose value component may be an arbitrary value from an arbitrary domain. Here, we effectively add an additional attribute to $T(R'_{\hat{\jmath}_1})$ and then use the attribute as an implicit by-list attribute to restrict tuples in the partition of $T(R'_{\hat{\jmath}_1}) \mathbin{\hat{\times}} \cdots \mathbin{\hat{\times}} T(R'_{\hat{\jmath}_x})$ at time t to only those tuples that satisfy the predicate in $\psi_1$ involving the aggregate $f'_2$ at time t.


3.3.3  TQuel  Aggregates  in  the  Inner  When  Clause 


Assume now that the aggregate $f'_2$ appears in $\tau_1$ in (11) rather than in $\psi_1$. The only aggregates that can appear in $\tau_1$ are earliest and latest. Therefore, if we let $R_{agg_2}$ be the two-attribute algebraic equivalent of $f'_2$, then the algebraic equivalent of $f'_1$ would be the same as that given in (13) for an aggregate in the inner where clause, with one exception. The reference to $f'_2$ in $\tau_1$ is replaced by a reference to $N_{agg_2,y+1}$, not $N_{agg_2,y}$. The valid component of $N_{agg_2,y}$ is the time when the valid component of $N_{agg_2,y+1}$ was the oldest interval, hence $N_{agg_2,y+1}$ is used in evaluating $\tau_1$.

If we assume that $f_2'$ is earliest, then $R_{agg_2}$ is


Ragg,  -  , ,  ==  SM ALLEST^^^ 

(•Rpo«t»on^T(i2jj)x  •••  xT(iij^)),  (14) 

j  ,  rfj  “  Ar^».*.on  t  A^po»ilion  I  A^Jl ,  1 1  ■■•I  A^9»  .  ( 

^♦*5  (^o»i<»on^T(i?j|)x  •  -  •  xT(i?j^)))) 

i  {Rv  otiiion  0{<(0,  7))})) 

over  the  scheme  Ma,g^  =  ,  iVojja.v+i}  where 

Rposition  =  ^Af,<„.„<,„/o(AposmON,  oo,  /V,,,  ) .  T(/?jJ))  (15) 

Expression  (14),  while  structurally  equivalent  to  expression  (8)  on  page  32,  is  considerably 
more  complex  because  of  the  presence  of  by,  when,  and  where  clauses  in  the  nested  aggregate. 


The  attributes  of  A’s  first  argument  now  include  the  attributes  appearing  in  the  by  clause  and  the 
attributes  of  A’s  second  argument  include  the  attributes  of  relations  associated  with  tuple  variables 
appearing  in  the  aggregate.  Also,  tuples  in  the  second  argument  are  now  required  to  satisfy  the 
where  predicate  and,  for  some  interval  in  the  time-stamp  of  attribute  the  when  predicate. 

Finally, because TQuel assumes earliest and latest return T for an empty partition of $R'$, the
tuple ((0, T)) is added to $R_{position}$ so that T will be considered the earliest interval at those times
when the partition of A’s second argument is empty. Recall that SMALLEST, defined on page 32,
returns zero when passed an empty relation.


3.3.4  TQuel  Aggregates  in  the  Outer  Where  Clause 


Assume that the TQuel aggregate $f_1'$ appears in $\psi$ in (10) rather than in the target list. Then, the
algebraic equivalent of the TQuel retrieve statement is


^  (<5r,.  EXTENDI*. , SUCC(*  J)  n  n  ••  nN,,,i  n 


where the reference to $f_1'$ in $\psi$ is replaced by a reference to $N_{agg_1,p}$. Note that the only other
change from expression (11) is the elimination of attribute $N_{agg_1,p}$ from the projection, since the
aggregate does not appear in the target list.


3.3.5  TQuel  Aggregates  in  the  Outer  When  Clause 


Assume now that the aggregate $f_1'$ appears in $\tau$ in (10). Then, the algebraic equivalent of the
TQuel retrieve statement is


^  ....  N.„,.„(<5r,,EXTEND(*.,SUCC(*„))n  n  ■■  nN,,.,  n 


1  A  A  =  (T(/?i)  X  ■  •  •  XT(R^)X  Raggi))) 


where the reference to $f_1'$ in $\tau$ is replaced by a reference to $N_{agg_1,p+1}$. If the aggregate $f_1'$ is in
$v$ or $\chi$ rather than $\tau$, analogous changes would be required.


3.3.6  Multiply-nested  Aggregation 

The approach described above for handling aggregates in the inner where and when clauses can be
used to handle aggregates in a qualifying where or when clause of an aggregate in the outer where,
when, or valid clauses. This method of converting TQuel aggregates to their algebraic equivalents,
when there is an aggregate in a qualifying clause, can also handle an arbitrary level of nesting of
aggregates.


3.4  Correspondence  Theorems 

Now  that  all  possible  locations  for  aggregates  in  a  TQuel  retrieve  statement  have  been  examined, 
we  can  assert  that 

Theorem  3  Every  TQuel  retrieve  statement  has  an  equivalent  expression  in  our  historical  algebra. 

PROOF. Induct on the number of aggregates appearing in the statement to arrive at an equivalent
algebraic expression, applying the replacements discussed above in Sections 3.3.1 through 3.3.5, as
appropriate. Incorporate the handling of transaction time via the rollback operator ($\rho$) as discussed
elsewhere [McKenzie & Snodgrass 1987A]. Construct a tuple calculus expression for the retrieve
statement and the algebraic expression, then prove equivalence using the technique used in the
proof of Theorem 2. While the proof is aided by the presence of auxiliary relations in the tuple
calculus semantics for aggregates [Snodgrass 1987], it is still cumbersome and offers little additional
insight. ∎

In a similar fashion, by also using the modify_state and modify_scheme commands described
elsewhere [McKenzie & Snodgrass 1987B], one can construct equivalent algebraic statements for
the TQuel create, delete, append, replace, and destroy statements.

Theorem  4  The  historical  algebra  defined  here  is  strictly  more  powerful  than  TQuel. 

PROOF. The previous theorem shows that the expressive power of the algebra is as great as that of
TQuel. Now, for two TQuel relations $R_1'$ and $R_2'$, consider the algebraic expression $T(R_1') \hat{\times} T(R_2')$.
Because the semantics of TQuel requires that tuples rather than attributes be time-stamped, this
algebraic expression has no counterpart in TQuel. Hence, the algebra is strictly more powerful
than TQuel. ∎


4  Review  of  Design  Decisions 

In defining the historical algebra presented in Section 2, we were faced with three major design
decisions: whether to time-stamp tuples or attributes, whether to allow single-valued or set-valued
time-stamps, and whether to allow single-valued or set-valued attributes. We discuss here our
choices and the importance of those choices in determining the properties of the algebra. We also
mention the choices made for these design decisions by the developers of seven other historical
algebras: Ben-Zvi’s Time Relational Model [Ben-Zvi 1982], Clifford’s proposed extension to the
snapshot algebra [Clifford & Croker 1987], Gadia’s homogeneous and multihomogeneous historical
algebras [Gadia 1984, Gadia 1986], Jones’ extension to the snapshot algebra to support time-oriented
operations for LEGOL [Jones et al. 1979], Tansel’s historical algebra [Tansel 1986], and
Navathe’s historical algebra [Navathe & Ahmed 1986]. A detailed review and evaluation of historical
algebras, using desirable properties as evaluation criteria, can be found elsewhere [McKenzie &
Snodgrass 1987C].


4.1  Time-stamped  Attributes


We decided to time-stamp attributes rather than tuples to support historical queries. We wanted
the algebra to allow for the derivation of information valid at a time t from information in underlying
relations valid at other times, much as the snapshot algebra allows for the derivation of information
about entities or relationships from information in underlying relations about other entities or
relationships. This requirement implies that the algebra allow units of related information, possibly
valid at disjoint times, to be combined into a single related unit of information possibly valid at some
other times. Support for such a capability required that we define a cartesian product operator
that concatenates tuples, independent of their valid times, and preserves, in the resulting tuple, the
valid-time information for each of the underlying tuples. Only by time-stamping attributes could
we define a cartesian product operator with this property and maintain closure under cartesian
product.
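
As an illustration only, the cartesian product property described above can be sketched in Python. The
encoding is an assumption made for this sketch and is not the formal definition of Section 2: a
historical tuple is a dictionary from attribute name to a (value, time-stamp) pair, a time-stamp is a
frozenset of integer times, and the names h_tuple and cartesian_product are hypothetical.

```python
from itertools import product

# Hypothetical toy encoding (not the formal definition of Section 2):
# a historical tuple maps attribute name -> (value, time-stamp), and a
# time-stamp is a frozenset of integer times.


def h_tuple(**attrs):
    """Build one historical tuple from name=(value, iterable_of_times) pairs."""
    return {name: (value, frozenset(times)) for name, (value, times) in attrs.items()}


def cartesian_product(q, r):
    """Concatenate every tuple of q with every tuple of r.

    Tuples are combined regardless of their valid times, and each attribute
    keeps its own time-stamp in the resulting tuple.
    """
    result = []
    for tq, tr in product(q, r):
        if set(tq) & set(tr):
            raise ValueError("attribute names of the two relations must be disjoint")
        result.append({**tq, **tr})
    return result


# The two underlying tuples are valid at disjoint times (1..4 and 8..9).
employees = [h_tuple(name=("Ann", range(1, 5)))]
projects = [h_tuple(project=("Apollo", range(8, 10)))]
combined = cartesian_product(employees, projects)
```

In this sketch the single result tuple still records that "Ann" was valid at times 1 through 4 and
"Apollo" at times 8 and 9; a tuple-level time-stamp could not hold both facts in one tuple.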

Tansel and Gadia also time-stamp attributes. Only Tansel’s algebra and Gadia’s multihomogeneous
model, however, allow tuples with disjoint attribute time-stamps; Gadia’s homogeneous
model requires that a tuple’s attribute time-stamps be identical. Clifford assigns a time-stamp,
termed a lifespan, to each tuple in a relation and to each attribute in the relation’s scheme. The
lifespan of each attribute of a tuple is then computed as the intersection of the tuple’s lifespan
and the attribute’s lifespan, as specified in the relation’s scheme. Ben-Zvi, Jones, and Navathe all
time-stamp tuples only.

4.2  Set-valued  Time-stamps 

We decided to allow set-valued attribute time-stamps for several reasons. First, we wanted the
algebra to support the user-oriented conceptual view of historical relations as 3-dimensional objects
[Ariav 1986, Clifford & Tansel 1985] and each historical operator to have an interpretation,
consistent with its semantics, in accordance with this conceptual framework. That is, we wanted
the definitions of the algebraic operations to be consistent with the conceptual view that historical
operators manipulate space-filling objects. For example, the difference operator should take two
space-filling objects (i.e., historical relations) and produce an object that represents the mass (i.e.,
total historical information) present in the first object but not present in the second object. Note
that this description of operations on historical relations as “volume” operations on 3-dimensional
objects is consistent not only with the conceptual view of historical relations as space-filling objects
but also with the semantics of the individual snapshot algebraic operations as operations on
2-dimensional tables, extended to account for the additional dimension represented by valid time.
Secondly, we wanted the algebra to satisfy the following commutative, associative, and distributive
tautologies involving union, difference, and cartesian product that are defined in set theory
[Enderton 1977] as well as the non-conditional commutative laws involving selection and projection
presented by Ullman [Ullman 1982], while supporting the definition of historical intersection in
terms of historical difference.


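\[
\begin{aligned}
Q \mathbin{\hat{\cup}} R &= R \mathbin{\hat{\cup}} Q \\
Q \mathbin{\hat{\times}} R &= R \mathbin{\hat{\times}} Q \\
\hat{\sigma}_{F_1}(\hat{\sigma}_{F_2}(R)) &= \hat{\sigma}_{F_2}(\hat{\sigma}_{F_1}(R)) \\
Q \mathbin{\hat{\cup}} (R \mathbin{\hat{\cup}} S) &= (Q \mathbin{\hat{\cup}} R) \mathbin{\hat{\cup}} S \\
Q \mathbin{\hat{\times}} (R \mathbin{\hat{\times}} S) &= (Q \mathbin{\hat{\times}} R) \mathbin{\hat{\times}} S \\
Q \mathbin{\hat{\times}} (R \mathbin{\hat{\cup}} S) &= (Q \mathbin{\hat{\times}} R) \mathbin{\hat{\cup}} (Q \mathbin{\hat{\times}} S) \\
\hat{\sigma}_F(Q \mathbin{\hat{\cup}} R) &= \hat{\sigma}_F(Q) \mathbin{\hat{\cup}} \hat{\sigma}_F(R) \\
\hat{\sigma}_F(Q \mathbin{\hat{-}} R) &= \hat{\sigma}_F(Q) \mathbin{\hat{-}} \hat{\sigma}_F(R) \\
\hat{\pi}_X(Q \mathbin{\hat{\cup}} R) &= \hat{\pi}_X(Q) \mathbin{\hat{\cup}} \hat{\pi}_X(R) \\
Q \mathbin{\hat{\cap}} R &= Q \mathbin{\hat{-}} (Q \mathbin{\hat{-}} R)
\end{aligned}
\]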


We  specifically  did  not  include  one  tautology,  the  distributive  property  of  cartesian  product  over 
difference,  in  this  list  because  it  is  inconsistent  with  the  conceptual  view  of  operations  on  historical 
relations as “volume” operations on space-filling objects [McKenzie & Snodgrass 1987C]. Finally,
we  wanted  there  to  be  a  unique  representation  for  each  historical  relation  to  keep  the  semantics  of 
the  algebra  as  simple  as  possible. 
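
For orientation, the set-theoretic versions of several of these laws can be checked mechanically on
ordinary snapshot relations. The Python sketch below models snapshot relations as sets of value
tuples only. It does not implement the historical operators of Section 2, and the helper name times
and the sample relations are assumptions made for this illustration.

```python
from itertools import product

# Snapshot relations modeled as plain sets of value tuples (no time-stamps).
Q = {("a",), ("b",)}
R = {("b",), ("c",)}
S = {("c",), ("d",)}


def times(x, y):
    """Snapshot cartesian product: concatenate every pair of tuples."""
    return {tx + ty for tx, ty in product(x, y)}


def select(pred, rel):
    """Snapshot selection: keep the tuples that satisfy pred."""
    return {t for t in rel if pred(t)}


def not_a(t):
    return t[0] != "a"


def not_c(t):
    return t[0] != "c"


# Associativity of union and distributivity of product over union.
assert Q | (R | S) == (Q | R) | S
assert times(Q, R | S) == times(Q, R) | times(Q, S)

# Selections commute, and selection distributes over union and difference.
assert select(not_a, select(not_c, Q | R)) == select(not_c, select(not_a, Q | R))
assert select(not_a, Q | R) == select(not_a, Q) | select(not_a, R)
assert select(not_a, Q - R) == select(not_a, Q) - select(not_a, R)

# Intersection defined in terms of difference.
assert Q & R == Q - (Q - R)

# For ordinary sets, product also distributes over difference; this is the
# one law that is deliberately not required of the historical operators.
snapshot_distributes_over_difference = times(Q, R - S) == times(Q, R) - times(Q, S)
```

The last check succeeds for plain sets, which is why leaving that law out of the list for the
historical algebra is a design choice rather than a consequence of set theory.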

If we had decided to disallow set-valued attribute time-stamps, then we would have had to permit
value-equivalent tuples to model real-world temporal relationships accurately. Yet,
value-equivalent tuples, because they spread temporal relationships among attributes across tuples, would
have caused problems in defining an algebra with the above properties. If value-equivalent tuples
had been allowed (and set-valued attribute time-stamps disallowed), a unique representation for
each historical relation could not have been specified without imposing inter-tuple restrictions on
the attribute time-stamps of value-equivalent tuples. Also, historical operators, in particular the
difference operator, that would have satisfied both the conceptual view of historical operations as
“volume” operations on space-filling objects and the above tautologies, while preventing loss of
information about temporal relationships as an operator side-effect, could not have been defined.

By allowing set-valued attribute time-stamps (and disallowing value-equivalent tuples), we were
able  to  define  an  algebra  that  has  the  desired  properties.  Because  value-equivalent  tuples  are 
disallowed,  each  historical  relation  is  guaranteed  to  have  a  unique  representation.  In  addition,  the 
definitions  of  historical  operators  given  in  Section  2  are  consistent  with  the  conceptual  view  of 
historical  operations  as  “volume”  operations  on  space-filling  objects,  and  the  algebra  satisfies  the 
ten  tautologies  listed  above. 

The decision to allow set-valued attribute time-stamps unfortunately prevented the algebra
from having other, less important but still desirable, properties. If we had not allowed
set-valued  attribute  time-stamps,  we  could  have  retained  the  first-normal-form  property  of  the 
snapshot  algebra.  Also,  we  could  have  replaced  the  single  complex  historical  derivation  operator 
with  two  simple  operators,  one  performing  historical  selection  and  the  other  performing  historical 
projection. 

Clifford  and  Gadia  also  allow  set-valued  time-stamps.  Ben-Zvi,  Jones,  Navathe,  and  Tansel  all 
allow  only  single-valued  time-stamps. 
4.3  Single-valued  Attributes

We  decided  to  restrict  attributes  to  single  values  to  retain  in  our  algebra  the  commutative  properties 
of  the  selection  operator  found  in  the  snapshot  algebra.  If  we  had  allowed  set-valued  attributes, 
without imposing intra-tuple restrictions on attribute time-stamps, then we would have had to
combine the functions of the selection and historical derivation operators into a single, more
powerful  operator.  This  consolidation  would  have  been  necessary  to  ensure  that  the  temporal 
predicate  in  the  current  historical  derivation  operator  was  considered  to  be  true  for  an  assignment 
of  intervals  to  attribute  names  only  when  the  predicate  in  the  current  selection  operator  held  for 
the  attribute  values  associated  with  those  intervals.  This  new  operator  would  have  satisfied  the 
commutative  properties  of  the  current  selection  operator  only  in  restricted  cases.  Hence  we  would 
have  limited  the  usefulness  of  key  optimization  strategies  in  future  implementations  of  our  algebra. 

Ben-Zvi,  Jones,  and  Navathe  also  restrict  attributes  to  single  values.  Clifford,  Gadia,  and 
Tansel, however, allow set-valued attributes.


5  Summary  and  Future  Work 


This paper makes two contributions. First, an historical algebra is defined as a straightforward
extension of the conventional relational algebra. Secondly, the algebra is shown to have the expressive
power of the temporal query language TQuel.

The  design  of  an  historical  algebra  is  a  surprisingly  difficult  task.  Although  defining  an  algebra 
that  has  a  given  property  is  easy,  it  is  much  more  difficult  to  define  an  algebra  that  has  many 
desirable  properties.  We  found  that  many  subtle  issues  arise  when  attempting  to  define  an  algebra 
that satisfies several design goals. Also, not all desirable properties of historical algebras are
compatible [McKenzie & Snodgrass 1987C]. Hence, the best that can be hoped for is not an algebra
with  all  possible  desirable  properties  but  an  algebra  with  a  maximal  subset  of  the  most  desirable 
properties. 

The historical algebra defined in Section 2 has what we consider to be the most desirable
properties of an historical algebra. First, the algebra is a straightforward extension of the snapshot
algebra. Each relation and algebraic expression in the snapshot algebra has an equivalent
counterpart in the historical algebra. Expressions in the snapshot algebra can be converted to their
historical equivalent simply by replacing each snapshot operator with its corresponding historical
operator and converting the referenced snapshot relations to historical relations by assigning all
attributes the same time-stamp. The historical operators $\hat{\cup}$, $\hat{\times}$, $\hat{\sigma}$, and $\hat{\pi}$ all reduce to their
snapshot counterparts when all attribute time-stamps are the same. The algebra is also consistent
with the conceptual view of historical relations as 3-dimensional, space-filling objects and the view
of operations on historical relations as “volume” operations. In addition, the algebra supports
historical queries, has the expressive power of a non-procedural temporal query language, includes
aggregates, does not exhibit temporal data loss as an operator side-effect, and has a unique
representation for each historical relation. Finally, the algebra satisfies all but one of the commutative,
associative, and distributive tautologies involving union, difference, and cartesian product as well
as the non-conditional commutative laws involving selection and projection. No other historical
algebra to our knowledge has all these properties.
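
The conversion of a snapshot relation into a historical relation by assigning every attribute the same
time-stamp can be sketched as follows. This is a toy Python illustration under assumed encodings
(a historical tuple as a dictionary from attribute name to a (value, frozenset of times) pair); the
function names to_historical and to_snapshot are hypothetical and do not come from the algebra.

```python
def to_historical(rows, attribute_names, stamp):
    """Give every attribute of every snapshot row the same time-stamp."""
    shared = frozenset(stamp)
    return [
        {name: (value, shared) for name, value in zip(attribute_names, row)}
        for row in rows
    ]


def to_snapshot(h_rows, attribute_names):
    """Drop the time-stamps and recover the set of value tuples."""
    return {tuple(t[name][0] for name in attribute_names) for t in h_rows}


snapshot = {("Ann", "Sales"), ("Bob", "R&D")}
names = ("employee", "dept")
historical = to_historical(sorted(snapshot), names, range(0, 10))
```

Dropping the shared time-stamp again returns the original snapshot relation, which is the sense in
which each snapshot relation has a historical counterpart in this sketch.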

The obvious future work is an implementation of the algebra as defined in Section 2 and
development of optimization strategies. At this point, we feel that the formal definition of temporal
databases and their query languages has yielded many results (cf. [McKenzie 1986]), while
implementation issues such as access methods, physical storage structures, and novel storage devices
remain largely unexplored.
A  Notational  Conventions


This  appendix  describes  the  notational  conventions  used  in  this  paper. 


Notation                  Usage

$\hat{\cup}$              Historical union operator
$\hat{-}$                 Historical difference operator
$\hat{\times}$            Historical cartesian product operator
$\hat{\sigma}$            Historical selection operator
$\hat{\pi}$               Historical projection operator
$\hat{\delta}$            Historical derivation operator
A                         Historical aggregation function for non-unique aggregates
AU                        Historical aggregation function for unique aggregates
a, b, c, d                Attribute variables
$D_a$                     Arbitrary flat domain associated with attribute $N_a$
F                         Predicate in the historical selection operator
f                         Scalar aggregate
G                         Predicate in the historical derivation operator
g, i, j                   Relation variables
h, l                      Variables ranging over attributes in target list, by-list, or aggregate
$\mathcal{I}$             Domain of intervals
I                         Interval
$I_a$                     Interval from the time-stamp of attribute $N_a$
[illegible]               Shorthand for [illegible]
k                         Number of relations
m, $m_i$                  Number of attributes in relation schemes M, $M_i$
M, $M_i$                  Relation schemes
$N_a$                     Attribute names
n                         Length of target list or by-list
$P(\mathcal{I})$          Power set of $\mathcal{I}$
$P(\mathcal{T})$          Power set of $\mathcal{T}$
p, y                      Number of attributes appearing in an aggregate
Q, R, S                   Historical relations
q, r, $r_i$               Historical tuple variables
$Q'$, $R'$, $R_i'$        TQuel relations
[illegible]               TQuel tuple variables
$\mathcal{T}$             Time domain
T                         Subset of $\mathcal{T}$
t                         Element of $\mathcal{T}$
[illegible]               Temporary variables
[illegible]               Temporal function in the historical derivation operator
$valid(r(N_a))$           Time-stamp of attribute $N_a$ of tuple r
$valid(r_a)$              Shorthand for $valid(r(N_a))$
$value(r(N_a))$           Value component of attribute $N_a$ of tuple r
$value(r_a)$              Shorthand for $value(r(N_a))$
[illegible]               Aggregation window function
[illegible]               Set of by-list attributes in an aggregate
z                         Number of tuple variables appearing in an aggregate


B  Auxiliary  Functions 


We  used  several  auxiliary  functions  in  the  definition  of  the  historical  derivation  operator.  We 
present  here  formal  definitions  for  each  of  those  auxiliary  functions. 

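FIRST takes a set of times from the domain $P(\mathcal{T})$ and maps it into the earliest time in the set.

\[
\mathrm{FIRST} : P(\mathcal{T}) \rightarrow \mathcal{T} \cup \{\bot\}
\]

\[
\mathrm{FIRST}(T) =
\begin{cases}
\bot & T = \emptyset \\
t,\ t \in T \wedge \forall t',\ t' \in T,\ t \le t' & \text{otherwise}
\end{cases}
\]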


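LAST takes a set of times from the domain $P(\mathcal{T})$ and maps it into the latest time in the set.

\[
\mathrm{LAST} : P(\mathcal{T}) \rightarrow \mathcal{T} \cup \{\bot\}
\]

\[
\mathrm{LAST}(T) =
\begin{cases}
\bot & T = \emptyset \\
t,\ t \in T \wedge \forall t',\ t' \in T,\ t \ge t' & \text{otherwise}
\end{cases}
\]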
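
A minimal Python sketch of FIRST and LAST follows. It assumes, for illustration only, that times are
integers, that a set of times is a Python set, and that the undefined value is represented by None;
the lowercase function names are hypothetical.

```python
from typing import Optional, Set

BOTTOM = None  # stands in for the undefined value returned on the empty set


def first(times: Set[int]) -> Optional[int]:
    """Earliest time in the set, or BOTTOM when the set is empty."""
    return min(times) if times else BOTTOM


def last(times: Set[int]) -> Optional[int]:
    """Latest time in the set, or BOTTOM when the set is empty."""
    return max(times) if times else BOTTOM
```

For example, first({3, 1, 2}) evaluates to 1 and last({3, 1, 2}) to 3, while both functions return
None for the empty set.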


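PRED is the predecessor function on the domain $\mathcal{T}$. It maps a time into its immediate predecessor
in the linear ordering of all times.

\[
\mathrm{PRED} : \mathcal{T} \rightarrow \mathcal{T} \cup \{\bot\}
\]

\[
\mathrm{PRED}(t) =
\begin{cases}
\bot & t = \mathrm{FIRST}(\mathcal{T}) \\
t_p,\ t_p \in \mathcal{T} \wedge t_p < t \wedge \forall t',\ t' \in \mathcal{T} \wedge t' < t,\ t' \le t_p & \text{otherwise}
\end{cases}
\]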

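SUCC is the successor function on the domain $\mathcal{T}$. It maps a time into its immediate successor in
the linear ordering of all times.

\[
\mathrm{SUCC} : \mathcal{T} \rightarrow \mathcal{T}
\]

\[
\mathrm{SUCC}(t) = t_s,\ t_s \in \mathcal{T} \wedge t_s > t \wedge \forall t',\ t' \in \mathcal{T} \wedge t' > t,\ t' \ge t_s
\]
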
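
PRED and SUCC can be sketched in Python under one concrete assumption that is not part of the
definitions themselves: the time domain is modeled as the integers starting at T_MIN = 0, with no
latest time, so that SUCC is total. The names T_MIN, pred, and succ are hypothetical.

```python
from typing import Optional

T_MIN = 0  # assumed earliest time of the domain, i.e. FIRST of the whole domain


def pred(t: int) -> Optional[int]:
    """Immediate predecessor of t, or None (undefined) for the earliest time."""
    if t < T_MIN:
        raise ValueError("t is not in the time domain")
    return None if t == T_MIN else t - 1


def succ(t: int) -> int:
    """Immediate successor of t; total because the assumed domain is unbounded above."""
    if t < T_MIN:
        raise ValueError("t is not in the time domain")
    return t + 1
```

With this encoding pred(5) is 4, succ(5) is 6, and pred(0) is None, mirroring the undefined case for
the earliest time.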
Let the domain $\mathcal{I}$ be the subset of $P(\mathcal{T})$ that represents all possible non-disjoint intervals of time.

\[
\mathcal{I} = \{ I \mid I \in P(\mathcal{T}) \wedge \forall t,\ t \in I \leftrightarrow \mathrm{FIRST}(I) \le t \le \mathrm{LAST}(I) \}
\]

Note that $\mathcal{I}$ includes intervals of length 1. Also let $P(\mathcal{I})$ be the power set of $\mathcal{I}$. While $\mathcal{I} \subset P(\mathcal{T})$,
each element of $P(\mathcal{I})$ is a set, each of whose elements is also an element of $P(\mathcal{T})$.

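EXTEND maps two times into the set of times that represents the interval between the first time
and the second time.

\[
\mathrm{EXTEND} : \mathcal{T} \times \mathcal{T} \rightarrow \mathcal{I} \cup \{\bot\}
\]

\[
\mathrm{EXTEND}(t_1, t_2) =
\begin{cases}
\bot & t_1 > t_2 \\
\{ t \mid t_1 \le t \le t_2 \} & \text{otherwise}
\end{cases}
\]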
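
A small Python sketch of EXTEND, assuming integer times, None for the undefined value, and an
interval that includes both end points (the inclusive reading is an assumption of this sketch); the
function name extend is hypothetical.

```python
from typing import FrozenSet, Optional


def extend(t1: int, t2: int) -> Optional[FrozenSet[int]]:
    """Set of times from t1 through t2, or None when t1 is later than t2."""
    if t1 > t2:
        return None
    return frozenset(range(t1, t2 + 1))
```

Under these assumptions extend(2, 4) is the set {2, 3, 4}, extend(3, 3) is the interval of length 1
containing only 3, and extend(4, 2) is undefined.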


INTERVAL maps a set of times into the set of intervals containing the minimum number of
non-disjoint intervals represented by the input set. Each time in the input set appears in exactly
one interval in the output set and each interval in the output set is itself represented by a set of
times.

\[
\mathrm{INTERVAL} : P(\mathcal{T}) \rightarrow P(\mathcal{I}) \cup \emptyset
\]

\[
\mathrm{INTERVAL}(T) =
\begin{cases}
\emptyset & T = \emptyset \\
\{ I \mid I \in \mathcal{I} \wedge \forall t,\ t \in I,\ t \in T
  \wedge (\mathrm{PRED}(t) \in T \rightarrow \mathrm{PRED}(t) \in I)
  \wedge (\mathrm{SUCC}(t) \in T \rightarrow \mathrm{SUCC}(t) \in I) \} & \text{otherwise}
\end{cases}
\]


Note  that  INTERVAL  partitions  a  set  of  times  into  the  minimum  number  of  non-disjoint  intervals 
represented  by  the  set;  each  time  in  T  appears  in  exactly  one  interval. 
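
The partitioning performed by INTERVAL can be sketched in Python as a scan over the sorted times that
starts a new interval whenever a gap appears. The sketch assumes integer times with successor t + 1
and represents each interval as a frozenset; the function name interval is hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set


def interval(times: Iterable[int]) -> Set[FrozenSet[int]]:
    """Partition a set of times into its maximal runs of consecutive times."""
    result: Set[FrozenSet[int]] = set()
    run: List[int] = []
    for t in sorted(set(times)):
        # A gap between t and the previous time closes the current run.
        if run and t != run[-1] + 1:
            result.add(frozenset(run))
            run = []
        run.append(t)
    if run:
        result.add(frozenset(run))
    return result
```

For the input {1, 2, 3, 7, 8, 10} this yields the three intervals {1, 2, 3}, {7, 8}, and {10}, which
is the minimum number of intervals covering the input, and every input time lies in exactly one of
them. The empty set maps to the empty set.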


